arXiv: 2209.00099 (v2)
Efficient Methods for Natural Language Processing: A Survey
31 August 2022
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
Manuel R. Ciosici
Michael Hassid
Kenneth Heafield
Sara Hooker
Colin Raffel
Pedro Henrique Martins
André F. T. Martins
Jessica Zosa Forde
Peter Milder
Edwin Simpson
Noam Slonim
Jesse Dodge
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
Papers citing "Efficient Methods for Natural Language Processing: A Survey"
50 of 244 citing papers shown
Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks
Curtis G. Northcutt
Anish Athalye
Jonas W. Mueller
76
537
0
26 Mar 2021
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Julia Kreutzer
Isaac Caswell
Lisa Wang
Ahsan Wahab
D. Esch
...
Duygu Ataman
Orevaoghene Ahia
Oghenefego Ahia
Sweta Agrawal
Mofetoluwa Adeyemi
56
278
0
22 Mar 2021
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
168
1,179
0
18 Mar 2021
Smoothing and Shrinking the Sparse Seq2Seq Search Space
Ben Peters
André F. T. Martins
125
17
0
18 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
204
1,022
0
04 Mar 2021
Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing
Zejian Liu
Gang Li
Jian Cheng
MQ
42
61
0
04 Mar 2021
Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation
Runzhe Zhan
Xuebo Liu
Derek F. Wong
Lidia S. Chao
75
46
0
03 Mar 2021
Random Feature Attention
Hao Peng
Nikolaos Pappas
Dani Yogatama
Roy Schwartz
Noah A. Smith
Lingpeng Kong
107
362
0
03 Mar 2021
Adaptive Semiparametric Language Models
Dani Yogatama
Cyprien de Masson d'Autume
Lingpeng Kong
KELM
RALM
80
100
0
04 Feb 2021
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler
Dan Alistarh
Tal Ben-Nun
Nikoli Dryden
Alexandra Peste
MQ
314
723
0
31 Jan 2021
Muppet: Massive Multi-task Representations with Pre-Finetuning
Armen Aghajanyan
Anchit Gupta
Akshat Shrivastava
Xilun Chen
Luke Zettlemoyer
Sonal Gupta
79
269
0
26 Jan 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
262
429
0
18 Jan 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
88
2,208
0
11 Jan 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
160
351
0
05 Jan 2021
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
Machel Reid
Edison Marrese-Taylor
Y. Matsuo
MoE
73
48
0
01 Jan 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
248
4,298
0
01 Jan 2021
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
273
90
0
31 Dec 2020
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
211
227
0
31 Dec 2020
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
Armen Aghajanyan
Luke Zettlemoyer
Sonal Gupta
101
570
1
22 Dec 2020
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Hanrui Wang
Zhekai Zhang
Song Han
122
393
0
17 Dec 2020
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
82
405
0
14 Dec 2020
Data and its (dis)contents: A survey of dataset development and use in machine learning research
Amandalynne Paullada
Inioluwa Deborah Raji
Emily M. Bender
Emily L. Denton
A. Hanna
121
525
0
09 Dec 2020
EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference
Thierry Tambe
Coleman Hooper
Lillian Pentecost
Tianyu Jia
En-Yu Yang
...
Victor Sanh
P. Whatmough
Alexander M. Rush
David Brooks
Gu-Yeon Wei
50
123
0
28 Nov 2020
Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay
Mostafa Dehghani
Samira Abnar
Yikang Shen
Dara Bahri
Philip Pham
J. Rao
Liu Yang
Sebastian Ruder
Donald Metzler
152
727
0
08 Nov 2020
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
Taylor Shin
Yasaman Razeghi
Robert L Logan IV
Eric Wallace
Sameer Singh
KELM
79
400
0
29 Oct 2020
AdapterDrop: On the Efficiency of Adapters in Transformers
Andreas Rucklé
Gregor Geigle
Max Glockner
Tilman Beck
Jonas Pfeiffer
Nils Reimers
Iryna Gurevych
121
266
0
22 Oct 2020
The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research
N. Ahmed
Muntasir Wahed
69
111
0
22 Oct 2020
Cold-start Active Learning through Self-supervised Language Modeling
Michelle Yuan
Hsuan-Tien Lin
Jordan L. Boyd-Graber
181
184
0
19 Oct 2020
SMYRF: Efficient Attention using Asymmetric Clustering
Giannis Daras
Nikita Kitaev
Augustus Odena
A. Dimakis
83
46
0
11 Oct 2020
Towards Accurate and Reliable Energy Measurement of NLP Models
Qingqing Cao
A. Balasubramanian
Niranjan Balasubramanian
18
33
0
11 Oct 2020
Self-Paced Learning for Neural Machine Translation
Boyi Deng
Baosong Yang
Derek F. Wong
Yikai Zhou
Lidia S. Chao
Haibo Zhang
Boxing Chen
110
49
0
09 Oct 2020
Characterising Bias in Compressed Models
Sara Hooker
Nyalleng Moorosi
Gregory Clark
Samy Bengio
Emily L. Denton
70
185
0
06 Oct 2020
Nearest Neighbor Machine Translation
Urvashi Khandelwal
Angela Fan
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
64
286
0
01 Oct 2020
Rethinking Attention with Performers
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
...
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
186
1,597
0
30 Sep 2020
TernaryBERT: Distillation-aware Ultra-low Bit BERT
Wei Zhang
Lu Hou
Yichun Yin
Lifeng Shang
Xiao Chen
Xin Jiang
Qun Liu
MQ
93
211
0
27 Sep 2020
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Swabha Swayamdipta
Roy Schwartz
Nicholas Lourie
Yizhong Wang
Hannaneh Hajishirzi
Noah A. Smith
Yejin Choi
118
452
0
22 Sep 2020
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
Timo Schick
Hinrich Schütze
130
974
0
15 Sep 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
159
1,124
0
14 Sep 2020
The Hardware Lottery
Sara Hooker
75
212
0
14 Sep 2020
A Survey of Deep Active Learning
Pengzhen Ren
Yun Xiao
Xiaojun Chang
Po-Yao (Bernie) Huang
Zhihui Li
Brij B. Gupta
Xiaojiang Chen
Xin Wang
103
1,146
0
30 Aug 2020
What is being transferred in transfer learning?
Behnam Neyshabur
Hanie Sedghi
Chiyuan Zhang
109
527
0
26 Aug 2020
Estimating Example Difficulty Using Variance of Gradients
Chirag Agarwal
Daniel D'souza
Sara Hooker
251
111
0
26 Aug 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
554
2,098
0
28 Jul 2020
The Computational Limits of Deep Learning
Neil C. Thompson
Kristjan Greenewald
Keeheon Lee
Gabriel F. Manso
VLM
53
528
0
10 Jul 2020
Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning
Matthias Feurer
Katharina Eggensperger
Stefan Falkner
Marius Lindauer
Frank Hutter
85
279
0
08 Jul 2020
Carbontracker: Tracking and Predicting the Carbon Footprint of Training Deep Learning Models
Lasse F. Wolff Anthony
Benjamin Kanding
Raghavendra Selvan
HAI
62
313
0
06 Jul 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhifeng Chen
MoE
116
1,184
0
30 Jun 2020
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos
Apoorv Vyas
Nikolaos Pappas
François Fleuret
201
1,786
0
29 Jun 2020
Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL
Lucas Zimmer
Marius Lindauer
Frank Hutter
MU
118
92
0
24 Jun 2020
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
F. Iandola
Albert Eaton Shaw
Ravi Krishna
Kurt Keutzer
VLM
71
127
0
19 Jun 2020