Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2504.10957
Cited By
v1
v2
v3 (latest)
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
15 April 2025
Hongkang Li
Yihua Zhang
Shuai Zhang
Ming Wang
Sijia Liu
Pin-Yu Chen
MoMe
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers"
18 / 68 papers shown
Title
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
...
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
166
1,012
1
10 Mar 2022
Benign Overfitting in Two-layer Convolutional Neural Networks
Yuan Cao
Zixiang Chen
M. Belkin
Quanquan Gu
MLT
88
89
0
14 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
859
9,729
0
28 Jan 2022
Merging Models with Fisher-Weighted Averaging
Michael Matena
Colin Raffel
FedML
MoMe
107
402
0
18 Nov 2021
An Explanation of In-context Learning as Implicit Bayesian Inference
Sang Michael Xie
Aditi Raghunathan
Percy Liang
Tengyu Ma
ReLM
BDL
VPVLM
LRM
239
764
0
03 Nov 2021
Robust fine-tuning of zero-shot models
Mitchell Wortsman
Gabriel Ilharco
Jong Wook Kim
Mike Li
Simon Kornblith
...
Raphael Gontijo-Lopes
Hannaneh Hajishirzi
Ali Farhadi
Hongseok Namkoong
Ludwig Schmidt
VLM
171
740
0
04 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
261
3,793
0
03 Sep 2021
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei
Sang Michael Xie
Tengyu Ma
141
100
0
17 Jun 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
252
4,313
0
01 Jan 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
687
41,563
0
22 Oct 2020
Descent-to-Delete: Gradient-Based Methods for Machine Unlearning
Seth Neel
Aaron Roth
Saeed Sharifi-Malvajerdi
MU
90
276
0
06 Jul 2020
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Jonathan Frankle
Gintare Karolina Dziugaite
Daniel M. Roy
Michael Carbin
MoMe
163
630
0
11 Dec 2019
Certified Data Removal from Machine Learning Models
Chuan Guo
Tom Goldstein
Awni Y. Hannun
Laurens van der Maaten
MU
118
451
0
08 Nov 2019
Making AI Forget You: Data Deletion in Machine Learning
Antonio A. Ginart
M. Guan
Gregory Valiant
James Zou
MU
91
481
0
11 Jul 2019
Invariant Risk Minimization
Martín Arjovsky
Léon Bottou
Ishaan Gulrajani
David Lopez-Paz
OOD
212
2,247
0
05 Jul 2019
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
297
3,225
0
20 Jun 2018
Averaging Weights Leads to Wider Optima and Better Generalization
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
147
1,673
0
14 Mar 2018
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,637
0
01 Sep 2014
Previous
1
2