ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.05643
  4. Cited By
A Kernel-Based View of Language Model Fine-Tuning

A Kernel-Based View of Language Model Fine-Tuning

11 October 2022
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
    VLM
ArXivPDFHTML

Papers citing "A Kernel-Based View of Language Model Fine-Tuning"

20 / 20 papers shown
Title
Tensor Product Attention Is All You Need
Tensor Product Attention Is All You Need
Yifan Zhang
Yifeng Liu
Huizhuo Yuan
Zhen Qin
Yang Yuan
Q. Gu
Andrew Chi-Chih Yao
77
9
0
11 Jan 2025
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models
Jinxu Lin
Linwei Tao
Minjing Dong
Chang Xu
TDI
41
2
0
24 Oct 2024
Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
H. Fernando
Han Shen
Parikshit Ram
Yi Zhou
Horst Samulowitz
Nathalie Baracaldo
Tianyi Chen
CLL
59
2
0
20 Oct 2024
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin
Sadhika Malladi
Adithya Bhaskar
Danqi Chen
Sanjeev Arora
Boris Hanin
96
16
0
11 Oct 2024
Investigating the Impact of Model Complexity in Large Language Models
Investigating the Impact of Model Complexity in Large Language Models
Jing Luo
Huiyuan Wang
Weiran Huang
39
0
0
01 Oct 2024
When does compositional structure yield compositional generalization? A kernel theory
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
73
5
0
26 May 2024
Learning Beyond Pattern Matching? Assaying Mathematical Understanding in
  LLMs
Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs
Siyuan Guo
Aniket Didolkar
Nan Rosemary Ke
Anirudh Goyal
Ferenc Huszár
Bernhard Schölkopf
52
3
0
24 May 2024
Active Few-Shot Fine-Tuning
Active Few-Shot Fine-Tuning
Jonas Hübotter
Bhavya Sukhija
Lenart Treven
Yarden As
Andreas Krause
45
1
0
13 Feb 2024
Scalable Neural Network Kernels
Scalable Neural Network Kernels
Arijit Sehanobish
Krzysztof Choromanski
Yunfan Zhao
Kumar Avinava Dubey
Valerii Likhosherstov
41
4
0
20 Oct 2023
Fine-Tuning Language Models with Just Forward Passes
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
29
177
0
27 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained
  Models
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
51
110
0
22 May 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu
Tianlong Chen
Zhenyu (Allen) Zhang
Xuxi Chen
Tianjin Huang
Ajay Jaiswal
Zhangyang Wang
32
29
0
03 Mar 2023
Understanding Reconstruction Attacks with the Neural Tangent Kernel and
  Dataset Distillation
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation
Noel Loo
Ramin Hasani
Mathias Lechner
Alexander Amini
Daniela Rus
DD
42
5
0
02 Feb 2023
An Analysis of Attention via the Lens of Exchangeability and Latent
  Variable Models
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
53
11
0
30 Dec 2022
A picture of the space of typical learnable tasks
A picture of the space of typical learnable tasks
Rahul Ramesh
Jialin Mao
Itay Griniasty
Rubing Yang
H. Teoh
Mark K. Transtrum
James P. Sethna
Pratik Chaudhari
SSL
DRL
36
5
0
31 Oct 2022
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
Sadhika Malladi
Kaifeng Lyu
A. Panigrahi
Sanjeev Arora
92
40
0
20 May 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,858
0
18 Apr 2021
Making Pre-trained Language Models Better Few-shot Learners
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
243
1,924
0
31 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural
  Language Inference
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,589
0
21 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
1