Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
arXiv:2210.10041 · 18 October 2022
Shuo Xie, Jiahao Qiu, Ankita Pasad, Li Du, Qing Qu, Hongyuan Mei
Papers citing "Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning" (25 papers shown)
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail
Sam Bowman (15 Oct 2021)
Towards a Unified View of Parameter-Efficient Transfer Learning
Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig (08 Oct 2021)
An Unconstrained Layer-Peeled Perspective on Neural Collapse
Wenlong Ji, Yiping Lu, Yiliang Zhang, Zhun Deng, Weijie J. Su (06 Oct 2021)
Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers
Jason Phang, Haokun Liu, Samuel R. Bowman (17 Sep 2021)
Layer-wise Analysis of a Self-supervised Speech Representation Model
Ankita Pasad, Ju-Chieh Chou, Karen Livescu (10 Jul 2021)
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken, Shauli Ravfogel, Yoav Goldberg (18 Jun 2021)
Learning How to Ask: Querying LMs with Mixtures of Soft Prompts
Guanghui Qin, J. Eisner (14 Apr 2021)
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov (24 Feb 2021)
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li, Percy Liang (01 Jan 2021)
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan, Hrant Khachatrian, Jonathan May (01 Jan 2021)
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo, Alexander M. Rush, Yoon Kim (14 Dec 2020)
Neural collapse with unconstrained features
D. Mixon, Hans Parshall, Jianzong Pi (23 Nov 2020)
Prevalence of Neural Collapse during the terminal phase of deep learning training
Vardan Papyan, Xuemei Han, D. Donoho (18 Aug 2020)
BERT Loses Patience: Fast and Robust Inference with Early Exit
Wangchunshu Zhou, Canwen Xu, Tao Ge, Julian McAuley, Ke Xu, Furu Wei (07 Jun 2020)
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen (05 Jun 2020)
What Happens To BERT Embeddings During Fine-tuning?
Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney (29 Apr 2020)
Information-Theoretic Probing with Minimum Description Length
Elena Voita, Ivan Titov (27 Mar 2020)
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita, Rico Sennrich, Ivan Titov (03 Sep 2019)
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita, David Talbot, F. Moiseev, Rico Sennrich, Ivan Titov (23 May 2019)
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney, Dipanjan Das, Ellie Pavlick (15 May 2019)
ERNIE: Enhanced Representation through Knowledge Integration
Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu (19 Apr 2019)
Linguistic Knowledge and Transferability of Contextual Representations
Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith (21 Mar 2019)
To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks
Matthew E. Peters, Sebastian Ruder, Noah A. Smith (14 Mar 2019)
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018)
Deep contextualized word representations
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer (15 Feb 2018)