LayerNorm: A key component in parameter-efficient fine-tuning

29 March 2024
Taha ValizadehAslani
Hualou Liang
arXiv: 2403.20284

Papers cited by "LayerNorm: A key component in parameter-efficient fine-tuning"

26 / 26 papers shown
Structured Pruning Learns Compact and Accurate Models
Mengzhou Xia
Zexuan Zhong
Danqi Chen
VLM
51
184
0
01 Apr 2022
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
48
76
0
18 Oct 2021
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Runxin Xu
Fuli Luo
Zhiyuan Zhang
Chuanqi Tan
Baobao Chang
Songfang Huang
Fei Huang
LRM
176
187
0
13 Sep 2021
FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning
Nam Hyeon-Woo
Moon Ye-Bin
Tae-Hyun Oh
FedML
80
122
0
13 Aug 2021
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken
Shauli Ravfogel
Yoav Goldberg
162
1,218
0
18 Jun 2021
The Principles of Deep Learning Theory
Daniel A. Roberts
Sho Yaida
Boris Hanin
FaML
PINN
GNN
59
245
0
18 Jun 2021
LoRA: Low-Rank Adaptation of Large Language Models
Edward J. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
413
10,328
0
17 Jun 2021
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning
Rabeeh Karimi Mahabadi
Yonatan Belinkov
James Henderson
DRL
47
74
0
10 Jun 2021
BERT Busters: Outlier Dimensions that Disrupt Transformers
Olga Kovaleva
Saurabh Kulshreshtha
Anna Rogers
Anna Rumshisky
68
90
0
14 May 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
543
4,036
0
18 Apr 2021
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
74
400
0
14 Dec 2020
What Happens To BERT Embeddings During Fine-tuning?
Amil Merchant
Elahe Rahimtoroghi
Ellie Pavlick
Ian Tenney
67
187
0
29 Apr 2020
On Layer Normalization in the Transformer Architecture
Ruibin Xiong
Yunchang Yang
Di He
Kai Zheng
Shuxin Zheng
Chen Xing
Huishuai Zhang
Yanyan Lan
Liwei Wang
Tie-Yan Liu
AI4CE
128
989
0
12 Feb 2020
Understanding and Improving Layer Normalization
Jingjing Xu
Xu Sun
Zhiyuan Zhang
Guangxiang Zhao
Junyang Lin
FAtt
83
352
0
16 Nov 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
615
24,431
0
26 Jul 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
230
8,426
0
19 Jun 2019
Are Sixteen Heads Really Better than One?
Paul Michel
Omer Levy
Graham Neubig
MoE
100
1,061
0
25 May 2019
Neural Network Acceptability Judgments
Alex Warstadt
Amanpreet Singh
Samuel R. Bowman
230
1,407
0
31 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,154
0
20 Apr 2018
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
Adina Williams
Nikita Nangia
Samuel R. Bowman
520
4,476
0
18 Apr 2017
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
354
7,504
0
02 Dec 2016
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
308
2,859
0
26 Sep 2016
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
274
8,127
0
16 Jun 2016
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
310
6,672
0
08 Jun 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
463
43,289
0
11 Feb 2015
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAI
OCL
392
33,521
0
16 Oct 2013