The Cost of Training NLP Models: A Concise Overview


19 April 2020
Or Sharir
Barak Peleg
Y. Shoham

Papers citing "The Cost of Training NLP Models: A Concise Overview"

50 / 104 papers shown
Title
EZClone: Improving DNN Model Extraction Attack via Shape Distillation from GPU Execution Profiles
Jonah O'Brien Weiss
Tiago A. O. Alves
S. Kundu
MIACV
AAML
FedML
22
8
0
06 Apr 2023
The Shaky Foundations of Clinical Foundation Models: A Survey of Large Language Models and Foundation Models for EMRs
Michael Wornow
Yizhe Xu
Rahul Thapa
Birju S. Patel
E. Steinberg
Scott L. Fleming
M. Pfeffer
Jason Alan Fries
N. Shah
LM&MA
28
32
0
22 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
25
42
0
10 Mar 2023
Provable Data Subset Selection For Efficient Neural Network Training
M. Tukan
Samson Zhou
Alaa Maalouf
Daniela Rus
Vladimir Braverman
Dan Feldman
MLT
25
9
0
09 Mar 2023
The Contribution of Knowledge in Visiolinguistic Learning: A Survey on Tasks and Challenges
Maria Lymperaiou
Giorgos Stamou
VLM
32
4
0
04 Mar 2023
On the Generalization Ability of Retrieval-Enhanced Transformers
Tobias Norlund
Ehsan Doostmohammadi
Richard Johansson
Marco Kuhlmann
RALM
27
6
0
23 Feb 2023
Complex QA and language models hybrid architectures, Survey
Xavier Daull
P. Bellot
Emmanuel Bruno
Vincent Martin
Elisabeth Murisasco
ELM
28
15
0
17 Feb 2023
Which Model Shall I Choose? Cost/Quality Trade-offs for Text Classification Tasks
Shi Zong
Joshua Seltzer
Jia-Yu Pan
Pan
Kathy Cheng
Jimmy J. Lin
21
4
0
17 Jan 2023
Renormalization in the neural network-quantum field theory correspondence
Harold Erbin
Vincent Lahoche
D. O. Samary
39
7
0
22 Dec 2022
Review of security techniques for memristor computing systems
Minhui Zou
Nan Du
Shahar Kvatinsky
AAML
16
7
0
19 Dec 2022
Memorization of Named Entities in Fine-tuned BERT Models
Andor Diera
N. Lell
Aygul Garifullina
A. Scherp
17
0
0
07 Dec 2022
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Joel Niklaus
Daniele Giofré
30
11
0
30 Nov 2022
A survey on knowledge-enhanced multimodal learning
Maria Lymperaiou
Giorgos Stamou
41
13
0
19 Nov 2022
Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training
Ashish R. Mittal
D. Sivasubramanian
Rishabh K. Iyer
P. Jyothi
Ganesh Ramakrishnan
19
3
0
30 Oct 2022
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Muralidhar Andoorveedu
Zhanda Zhu
Bojian Zheng
Gennady Pekhimenko
20
6
0
19 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
31
47
0
13 Oct 2022
Green Learning: Introduction, Examples and Outlook
C.-C. Jay Kuo
A. Madni
70
71
0
03 Oct 2022
Dataset Inference for Self-Supervised Models
Adam Dziedzic
Haonan Duan
Muhammad Ahmad Kaleem
Nikita Dhawan
Jonas Guan
Yannis Cattan
Franziska Boenisch
Nicolas Papernot
32
26
0
16 Sep 2022
Training a T5 Using Lab-sized Resources
Manuel R. Ciosici
Leon Derczynski
VLM
33
8
0
25 Aug 2022
Deep Unsupervised Domain Adaptation: A Review of Recent Advances and Perspectives
Xiaofeng Liu
Chaehwa Yoo
Fangxu Xing
Hyejin Oh
G. El Fakhri
Je-Won Kang
Jonghye Woo
OOD
43
191
0
15 Aug 2022
Efficient model compression with Random Operation Access Specific Tile (ROAST) hashing
Aditya Desai
K. Zhou
Anshumali Shrivastava
14
1
0
21 Jul 2022
Confident Adaptive Language Modeling
Tal Schuster
Adam Fisch
Jai Gupta
Mostafa Dehghani
Dara Bahri
Vinh Q. Tran
Yi Tay
Donald Metzler
43
160
0
14 Jul 2022
PASHA: Efficient HPO and NAS with Progressive Resource Allocation
Ondrej Bohdal
Lukas Balles
Martin Wistuba
B. Ermiş
Cédric Archambeau
Giovanni Zappella
32
12
0
14 Jul 2022
Machine Learning Model Sizes and the Parameter Gap
Pablo Villalobos
J. Sevilla
T. Besiroglu
Lennart Heim
A. Ho
Marius Hobbhahn
ALM
ELM
AI4CE
30
58
0
05 Jul 2022
Tutel: Adaptive Mixture-of-Experts at Scale
Changho Hwang
Wei Cui
Yifan Xiong
Ziyue Yang
Ze Liu
...
Joe Chau
Peng Cheng
Fan Yang
Mao Yang
Y. Xiong
MoE
97
110
0
07 Jun 2022
Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model
Sosuke Kobayashi
Shun Kiyono
Jun Suzuki
Kentaro Inui
MoMe
26
7
0
24 May 2022
On the Difficulty of Defending Self-Supervised Learning against Model Extraction
Adam Dziedzic
Nikita Dhawan
Muhammad Ahmad Kaleem
Jonas Guan
Nicolas Papernot
MIACV
54
22
0
16 May 2022
MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning
Ehud D. Karpas
Omri Abend
Yonatan Belinkov
Barak Lenz
Opher Lieber
...
Erez Schwartz
Gal Shachaf
Shai Shalev-Shwartz
Amnon Shashua
Moshe Tenenholtz
LLMAG
12
68
0
01 May 2022
Standing on the Shoulders of Giant Frozen Language Models
Yoav Levine
Itay Dalmedigos
Ori Ram
Yoel Zeldes
Daniel Jannai
...
Barak Lenz
Shai Shalev-Shwartz
Amnon Shashua
Kevin Leyton-Brown
Y. Shoham
VLM
35
49
0
21 Apr 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
74
368
0
18 Apr 2022
Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
Ali Hadi Zadeh
Mostafa Mahmoud
Ameer Abdelhadi
Andreas Moshovos
MQ
24
31
0
23 Mar 2022
Towards Personalized Intelligence at Scale
Yiping Kang
Ashish Mahendra
Christopher Clarke
Lingjia Tang
Jason Mars
17
1
0
13 Mar 2022
DCT-Former: Efficient Self-Attention with Discrete Cosine Transform
Carmelo Scribano
Giorgia Franchini
M. Prato
Marko Bertogna
18
21
0
02 Mar 2022
Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured Review
Kyle Hamilton
Aparna Nayak
Bojan Bozic
Luca Longo
NAI
29
57
0
24 Feb 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
27
269
0
11 Feb 2022
Benchmarking Resource Usage for Efficient Distributed Deep Learning
Nathan C. Frey
Baolin Li
Joseph McDonald
Dan Zhao
Michael Jones
David Bestor
Devesh Tiwari
V. Gadepally
S. Samsi
32
9
0
28 Jan 2022
Copy, Right? A Testing Framework for Copyright Protection of Deep Learning Models
Jialuo Chen
Jingyi Wang
Tinglan Peng
Youcheng Sun
Peng Cheng
S. Ji
Xingjun Ma
Bo-wen Li
D. Song
AAML
12
63
0
10 Dec 2021
On the Existence of Universal Lottery Tickets
R. Burkholz
Nilanjana Laha
Rajarshi Mukherjee
Alkis Gotovos
UQCV
13
32
0
22 Nov 2021
Varuna: Scalable, Low-cost Training of Massive Deep Learning Models
Sanjith Athlur
Nitika Saran
Muthian Sivathanu
Ramachandran Ramjee
Nipun Kwatra
GNN
31
80
0
07 Nov 2021
The Efficiency Misnomer
Mostafa Dehghani
Anurag Arnab
Lucas Beyer
Ashish Vaswani
Yi Tay
34
99
0
25 Oct 2021
Automated Essay Scoring Using Transformer Models
Sabrina Ludwig
Christian W. F. Mayer
Christopher Hansen
Kerstin Eilers
Steffen Brandt
19
38
0
13 Oct 2021
Dynamic Language Models for Continuously Evolving Content
Spurthi Amba Hombaiah
Tao Chen
Mingyang Zhang
Michael Bendersky
Marc Najork
CLL
KELM
40
37
0
11 Jun 2021
Consistent Accelerated Inference via Confident Adaptive Transformers
Tal Schuster
Adam Fisch
Tommi Jaakkola
Regina Barzilay
AI4TS
184
69
0
18 Apr 2021
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision
Andrew Shin
Masato Ishii
T. Narihira
35
37
0
06 Mar 2021
GRAD-MATCH: Gradient Matching based Data Subset Selection for Efficient Deep Model Training
Krishnateja Killamsetty
D. Sivasubramanian
Ganesh Ramakrishnan
A. De
Rishabh K. Iyer
OOD
91
188
0
27 Feb 2021
GIST: Distributed Training for Large-Scale Graph Convolutional Networks
Cameron R. Wolfe
Jingkang Yang
Arindam Chowdhury
Chen Dun
Artun Bayer
Santiago Segarra
Anastasios Kyrillidis
BDL
GNN
LRM
49
9
0
20 Feb 2021
Scaling Down Deep Learning with MNIST-1D
S. Greydanus
Dmitry Kobak
13
20
0
29 Nov 2020
Challenges in Deploying Machine Learning: a Survey of Case Studies
Andrei Paleyes
Raoul-Gabriel Urma
Neil D. Lawrence
23
389
0
18 Nov 2020
Class-incremental learning: survey and performance evaluation on image classification
Marc Masana
Xialei Liu
Bartlomiej Twardowski
Mikel Menta
Andrew D. Bagdanov
Joost van de Weijer
CLL
25
660
0
28 Oct 2020
Accenture at CheckThat! 2020: If you say so: Post-hoc fact-checking of claims using transformer-based models
Evan Williams
Paul Rodrigues
Valerie Novak
34
42
0
05 Sep 2020