Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2210.05643
Cited By
A Kernel-Based View of Language Model Fine-Tuning
11 October 2022
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Kernel-Based View of Language Model Fine-Tuning"
20 / 20 papers shown
Title
Tensor Product Attention Is All You Need
Yifan Zhang
Yifeng Liu
Huizhuo Yuan
Zhen Qin
Yang Yuan
Q. Gu
Andrew Chi-Chih Yao
77
9
0
11 Jan 2025
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
Jinxu Lin
Linwei Tao
Minjing Dong
Chang Xu
TDI
41
2
0
24 Oct 2024
Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
H. Fernando
Han Shen
Parikshit Ram
Yi Zhou
Horst Samulowitz
Nathalie Baracaldo
Tianyi Chen
CLL
59
2
0
20 Oct 2024
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin
Sadhika Malladi
Adithya Bhaskar
Danqi Chen
Sanjeev Arora
Boris Hanin
96
16
0
11 Oct 2024
Investigating the Impact of Model Complexity in Large Language Models
Jing Luo
Huiyuan Wang
Weiran Huang
39
0
0
01 Oct 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
73
5
0
26 May 2024
Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs
Siyuan Guo
Aniket Didolkar
Nan Rosemary Ke
Anirudh Goyal
Ferenc Huszár
Bernhard Schölkopf
52
3
0
24 May 2024
Active Few-Shot Fine-Tuning
Jonas Hübotter
Bhavya Sukhija
Lenart Treven
Yarden As
Andreas Krause
45
1
0
13 Feb 2024
Scalable Neural Network Kernels
Arijit Sehanobish
Krzysztof Choromanski
Yunfan Zhao
Kumar Avinava Dubey
Valerii Likhosherstov
41
4
0
20 Oct 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
29
177
0
27 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
51
110
0
22 May 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu
Tianlong Chen
Zhenyu (Allen) Zhang
Xuxi Chen
Tianjin Huang
Ajay Jaiswal
Zhangyang Wang
32
29
0
03 Mar 2023
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation
Noel Loo
Ramin Hasani
Mathias Lechner
Alexander Amini
Daniela Rus
DD
42
5
0
02 Feb 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
53
11
0
30 Dec 2022
A picture of the space of typical learnable tasks
Rahul Ramesh
Jialin Mao
Itay Griniasty
Rubing Yang
H. Teoh
Mark K. Transtrum
James P. Sethna
Pratik Chaudhari
SSL
DRL
36
5
0
31 Oct 2022
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
Sadhika Malladi
Kaifeng Lyu
A. Panigrahi
Sanjeev Arora
92
40
0
20 May 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,858
0
18 Apr 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
243
1,924
0
31 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,589
0
21 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
1