ResearchTrend.AI
Efficient Methods for Natural Language Processing: A Survey

31 August 2022
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
Manuel R. Ciosici
Michael Hassid
Kenneth Heafield
Sara Hooker
Colin Raffel
Pedro Henrique Martins
André F. T. Martins
Jessica Zosa Forde
Peter Milder
Edwin Simpson
Noam Slonim
Jesse Dodge
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
arXiv: 2209.00099

Papers citing "Efficient Methods for Natural Language Processing: A Survey"

44 / 244 papers shown
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
Jordan T. Ash
Chicheng Zhang
A. Krishnamurthy
John Langford
Alekh Agarwal
BDL UQCV
88
776
0
09 Jun 2019
Energy and Policy Considerations for Deep Learning in NLP
Emma Strubell
Ananya Ganesh
Andrew McCallum
73
2,660
0
05 Jun 2019
Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
Aishwarya Bhandare
Vamsi Sripathi
Deepthi Karkada
Vivek V. Menon
Sun Choi
Kushal Datta
V. Saletore
MQ
69
132
0
03 Jun 2019
A Study of BFLOAT16 for Deep Learning Training
Dhiraj D. Kalamkar
Dheevatsa Mudigere
Naveen Mellempudi
Dipankar Das
K. Banerjee
...
Sudarshan Srinivasan
Abhisek Kundu
M. Smelyanskiy
Bharat Kaul
Pradeep Dubey
MQ
83
347
0
29 May 2019
Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives
Yi Tay
Shuohang Wang
Anh Tuan Luu
Jie Fu
Minh C. Phan
Xingdi Yuan
J. Rao
S. Hui
Aston Zhang
88
110
0
26 May 2019
Are Sixteen Heads Really Better than One?
Paul Michel
Omer Levy
Graham Neubig
MoE
103
1,068
0
25 May 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita
David Talbot
F. Moiseev
Rico Sennrich
Ivan Titov
114
1,146
0
23 May 2019
Curriculum Learning for Domain Adaptation in Neural Machine Translation
Xuan Zhang
Pamela Shapiro
Manish Kumar
Paul McNamee
Marine Carpuat
Kevin Duh
64
124
0
14 May 2019
Sparse Sequence-to-Sequence Models
Ben Peters
Vlad Niculae
André F. T. Martins
TPM
177
213
0
14 May 2019
Generating Long Sequences with Sparse Transformers
R. Child
Scott Gray
Alec Radford
Ilya Sutskever
129
1,908
0
23 Apr 2019
Competence-based Curriculum Learning for Neural Machine Translation
Emmanouil Antonios Platanios
Otilia Stretcu
Graham Neubig
Barnabás Póczós
Tom Michael Mitchell
89
344
0
23 Mar 2019
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
161
761
0
25 Feb 2019
Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization
Hesham Mostafa
Xin Wang
79
314
0
15 Feb 2019
Parameter-Efficient Transfer Learning for NLP
N. Houlsby
A. Giurgiu
Stanislaw Jastrzebski
Bruna Morrone
Quentin de Laroussilhe
Andrea Gesmundo
Mona Attariyan
Sylvain Gelly
217
4,499
0
02 Feb 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
253
3,745
0
09 Jan 2019
Rethinking ImageNet Pre-training
Kaiming He
Ross B. Girshick
Piotr Dollár
VLM SSeg
130
1,086
0
21 Nov 2018
A System for Massively Parallel Hyperparameter Tuning
Liam Li
Kevin Jamieson
Afshin Rostamizadeh
Ekaterina Gonina
Moritz Hardt
Benjamin Recht
Ameet Talwalkar
68
386
0
13 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg
1.8K
95,114
0
11 Oct 2018
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference
Rowan Zellers
Yonatan Bisk
Roy Schwartz
Yejin Choi
109
718
0
16 Aug 2018
Practical Obstacles to Deploying Active Learning
David Lowell
Zachary Chase Lipton
Byron C. Wallace
84
111
0
12 Jul 2018
Universal Transformers
Mostafa Dehghani
Stephan Gouws
Oriol Vinyals
Jakob Uszkoreit
Lukasz Kaiser
87
755
0
10 Jul 2018
Measuring the Intrinsic Dimension of Objective Landscapes
Chunyuan Li
Heerad Farkhoor
Rosanne Liu
J. Yosinski
86
414
0
24 Apr 2018
Pieces of Eight: 8-bit Neural Machine Translation
Jerry Quinn
Miguel Ballesteros
MQ
53
25
0
13 Apr 2018
Datasheets for Datasets
Timnit Gebru
Jamie Morgenstern
Briana Vecchione
Jennifer Wortman Vaughan
Hanna M. Wallach
Hal Daumé
Kate Crawford
266
2,194
0
23 Mar 2018
Self-Attention with Relative Position Representations
Peter Shaw
Jakob Uszkoreit
Ashish Vaswani
177
2,295
0
06 Mar 2018
Deep contextualized word representations
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
NAI
227
11,565
0
15 Feb 2018
Learning Sparse Neural Networks through $L_0$ Regularization
Christos Louizos
Max Welling
Diederik P. Kingma
433
1,147
0
04 Dec 2017
Mixed Precision Training
Paulius Micikevicius
Sharan Narang
Jonah Alben
G. Diamos
Erich Elsen
...
Boris Ginsburg
Michael Houston
Oleksii Kuchaiev
Ganesh Venkatesh
Hao Wu
168
1,804
0
10 Oct 2017
Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging
Nils Reimers
Iryna Gurevych
75
437
0
31 Jul 2017
An Overview of Multi-Task Learning in Deep Neural Networks
Sebastian Ruder
CVBM
156
2,830
0
15 Jun 2017
Learning multiple visual domains with residual adapters
Sylvestre-Alvise Rebuffi
Hakan Bilen
Andrea Vedaldi
OOD
173
937
0
22 May 2017
Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
Xin Luna Dong
Shangyu Chen
Sinno Jialin Pan
178
506
0
22 May 2017
Search Engine Guided Non-Parametric Neural Machine Translation
Jiatao Gu
Yong Wang
Kyunghyun Cho
Victor O.K. Li
55
49
0
20 May 2017
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
1.2K
20,880
0
17 Apr 2017
Learning to Generate Reviews and Discovering Sentiment
Alec Radford
Rafal Jozefowicz
Ilya Sutskever
97
510
0
05 Apr 2017
Deep Bayesian Active Learning with Image Data
Y. Gal
Riashat Islam
Zoubin Ghahramani
BDL UQCV
73
1,739
0
08 Mar 2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam M. Shazeer
Azalia Mirhoseini
Krzysztof Maziarz
Andy Davis
Quoc V. Le
Geoffrey E. Hinton
J. Dean
MoE
251
2,683
0
23 Jan 2017
Sequence-Level Knowledge Distillation
Yoon Kim
Alexander M. Rush
122
1,120
0
25 Jun 2016
A large annotated corpus for learning natural language inference
Samuel R. Bowman
Gabor Angeli
Christopher Potts
Christopher D. Manning
321
4,293
0
21 Aug 2015
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
313
6,694
0
08 Jun 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
362
19,723
0
09 Mar 2015
Non-stochastic Best Arm Identification and Hyperparameter Optimization
Kevin Jamieson
Ameet Talwalkar
208
580
0
27 Feb 2015
Practical Bayesian Optimization of Machine Learning Algorithms
Jasper Snoek
Hugo Larochelle
Ryan P. Adams
359
7,954
0
13 Jun 2012
Sample Selection Bias Correction Theory
Corinna Cortes
M. Mohri
Michael Riley
Afshin Rostamizadeh
103
350
0
19 May 2008