The Cost of Training NLP Models: A Concise Overview

19 April 2020

Papers citing "The Cost of Training NLP Models: A Concise Overview"

EZClone: Improving DNN Model Extraction Attack via Shape Distillation from GPU Execution Profiles Jonah O'Brien Weiss Tiago A. O. Alves S. Kundu MIACV AAML FedML 22 8 0 06 Apr 2023
The Shaky Foundations of Clinical Foundation Models: A Survey of Large Language Models and Foundation Models for EMRs Michael Wornow Yizhe Xu Rahul Thapa Birju S. Patel E. Steinberg Scott L. Fleming M. Pfeffer Jason Alan Fries N. Shah LM&MA 28 32 0 22 Mar 2023
An Overview on Language Models: Recent Developments and Outlook Chengwei Wei Yun Cheng Wang Bin Wang C.-C. Jay Kuo 25 42 0 10 Mar 2023
Provable Data Subset Selection For Efficient Neural Network Training M. Tukan Samson Zhou Alaa Maalouf Daniela Rus Vladimir Braverman Dan Feldman MLT 25 9 0 09 Mar 2023
The Contribution of Knowledge in Visiolinguistic Learning: A Survey on Tasks and Challenges Maria Lymperaiou Giorgos Stamou VLM 32 4 0 04 Mar 2023
On the Generalization Ability of Retrieval-Enhanced Transformers Tobias Norlund Ehsan Doostmohammadi Richard Johansson Marco Kuhlmann RALM 27 6 0 23 Feb 2023
Complex QA and language models hybrid architectures, Survey Xavier Daull P. Bellot Emmanuel Bruno Vincent Martin Elisabeth Murisasco ELM 28 15 0 17 Feb 2023
Which Model Shall I Choose? Cost/Quality Trade-offs for Text Classification Tasks Shi Zong Joshua Seltzer Jia-Yu Pan Pan Kathy Cheng Jimmy J. Lin 21 4 0 17 Jan 2023
Renormalization in the neural network-quantum field theory correspondence Harold Erbin Vincent Lahoche D. O. Samary 39 7 0 22 Dec 2022
Review of security techniques for memristor computing systems Minhui Zou Nan Du Shahar Kvatinsky AAML 16 7 0 19 Dec 2022
Memorization of Named Entities in Fine-tuned BERT Models Andor Diera N. Lell Aygul Garifullina A. Scherp 17 0 0 07 Dec 2022
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch? Joel Niklaus Daniele Giofré 30 11 0 30 Nov 2022
A survey on knowledge-enhanced multimodal learning Maria Lymperaiou Giorgos Stamou 41 13 0 19 Nov 2022
Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training Ashish R. Mittal D. Sivasubramanian Rishabh K. Iyer P. Jyothi Ganesh Ramakrishnan 19 3 0 30 Oct 2022
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction Muralidhar Andoorveedu Zhanda Zhu Bojian Zheng Gennady Pekhimenko 20 6 0 19 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities Brian Bartoldson B. Kailkhura Davis W. Blalock 31 47 0 13 Oct 2022
Green Learning: Introduction, Examples and Outlook C.-C. Jay Kuo A. Madni 70 71 0 03 Oct 2022
Dataset Inference for Self-Supervised Models Adam Dziedzic Haonan Duan Muhammad Ahmad Kaleem Nikita Dhawan Jonas Guan Yannis Cattan Franziska Boenisch Nicolas Papernot 32 26 0 16 Sep 2022
Training a T5 Using Lab-sized Resources Manuel R. Ciosici Leon Derczynski VLM 33 8 0 25 Aug 2022
Deep Unsupervised Domain Adaptation: A Review of Recent Advances and Perspectives Xiaofeng Liu Chaehwa Yoo Fangxu Xing Hyejin Oh G. El Fakhri Je-Won Kang Jonghye Woo OOD 43 191 0 15 Aug 2022
Efficient model compression with Random Operation Access Specific Tile (ROAST) hashing Aditya Desai K. Zhou Anshumali Shrivastava 14 1 0 21 Jul 2022
Confident Adaptive Language Modeling Tal Schuster Adam Fisch Jai Gupta Mostafa Dehghani Dara Bahri Vinh Q. Tran Yi Tay Donald Metzler 43 160 0 14 Jul 2022
PASHA: Efficient HPO and NAS with Progressive Resource Allocation Ondrej Bohdal Lukas Balles Martin Wistuba B. Ermiş Cédric Archambeau Giovanni Zappella 32 12 0 14 Jul 2022
Machine Learning Model Sizes and the Parameter Gap Pablo Villalobos J. Sevilla T. Besiroglu Lennart Heim A. Ho Marius Hobbhahn ALM ELM AI4CE 30 58 0 05 Jul 2022
Tutel: Adaptive Mixture-of-Experts at Scale Changho Hwang Wei Cui Yifan Xiong Ziyue Yang Ze Liu ... Joe Chau Peng Cheng Fan Yang Mao Yang Y. Xiong MoE 97 110 0 07 Jun 2022
Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model Sosuke Kobayashi Shun Kiyono Jun Suzuki Kentaro Inui MoMe 26 7 0 24 May 2022
On the Difficulty of Defending Self-Supervised Learning against Model Extraction Adam Dziedzic Nikita Dhawan Muhammad Ahmad Kaleem Jonas Guan Nicolas Papernot MIACV 54 22 0 16 May 2022
MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning Ehud D. Karpas Omri Abend Yonatan Belinkov Barak Lenz Opher Lieber ... Erez Schwartz Gal Shachaf Shai Shalev-Shwartz Amnon Shashua Moshe Tenenholtz LLMAG 12 68 0 01 May 2022
Standing on the Shoulders of Giant Frozen Language Models Yoav Levine Itay Dalmedigos Ori Ram Yoel Zeldes Daniel Jannai ... Barak Lenz Shai Shalev-Shwartz Amnon Shashua Kevin Leyton-Brown Y. Shoham VLM 35 49 0 21 Apr 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance Katherine Crowson Stella Biderman Daniel Kornis Dashiell Stander Eric Hallahan Louis Castricato Edward Raff CLIP 74 368 0 18 Apr 2022
Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models Ali Hadi Zadeh Mostafa Mahmoud Ameer Abdelhadi Andreas Moshovos MQ 24 31 0 23 Mar 2022
Towards Personalized Intelligence at Scale Yiping Kang Ashish Mahendra Christopher Clarke Lingjia Tang Jason Mars 17 1 0 13 Mar 2022
DCT-Former: Efficient Self-Attention with Discrete Cosine Transform Carmelo Scribano Giorgia Franchini M. Prato Marko Bertogna 18 21 0 02 Mar 2022
Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured Review Kyle Hamilton Aparna Nayak Bojan Bozic Luca Longo NAI 29 57 0 24 Feb 2022
Compute Trends Across Three Eras of Machine Learning J. Sevilla Lennart Heim A. Ho T. Besiroglu Marius Hobbhahn Pablo Villalobos 27 269 0 11 Feb 2022
Benchmarking Resource Usage for Efficient Distributed Deep Learning Nathan C. Frey Baolin Li Joseph McDonald Dan Zhao Michael Jones David Bestor Devesh Tiwari V. Gadepally S. Samsi 32 9 0 28 Jan 2022
Copy, Right? A Testing Framework for Copyright Protection of Deep Learning Models Jialuo Chen Jingyi Wang Tinglan Peng Youcheng Sun Peng Cheng S. Ji Xingjun Ma Bo-wen Li D. Song AAML 12 63 0 10 Dec 2021
On the Existence of Universal Lottery Tickets R. Burkholz Nilanjana Laha Rajarshi Mukherjee Alkis Gotovos UQCV 13 32 0 22 Nov 2021
Varuna: Scalable, Low-cost Training of Massive Deep Learning Models Sanjith Athlur Nitika Saran Muthian Sivathanu Ramachandran Ramjee Nipun Kwatra GNN 31 80 0 07 Nov 2021
The Efficiency Misnomer Daoyuan Chen Liuyi Yao Dawei Gao Ashish Vaswani Yaliang Li 34 99 0 25 Oct 2021
Automated Essay Scoring Using Transformer Models Sabrina Ludwig Christian W. F. Mayer Christopher Hansen Kerstin Eilers Steffen Brandt 19 38 0 13 Oct 2021
Dynamic Language Models for Continuously Evolving Content Spurthi Amba Hombaiah Tao Chen Mingyang Zhang Michael Bendersky Marc Najork CLL KELM 40 37 0 11 Jun 2021
Consistent Accelerated Inference via Confident Adaptive Transformers Tal Schuster Adam Fisch Tommi Jaakkola Regina Barzilay AI4TS 184 69 0 18 Apr 2021
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision Andrew Shin Masato Ishii T. Narihira 35 37 0 06 Mar 2021
GRAD-MATCH: Gradient Matching based Data Subset Selection for Efficient Deep Model Training Krishnateja Killamsetty D. Sivasubramanian Ganesh Ramakrishnan A. De Rishabh K. Iyer OOD 91 188 0 27 Feb 2021
GIST: Distributed Training for Large-Scale Graph Convolutional Networks Cameron R. Wolfe Jingkang Yang Arindam Chowdhury Chen Dun Artun Bayer Santiago Segarra Anastasios Kyrillidis BDL GNN LRM 49 9 0 20 Feb 2021
Scaling Down Deep Learning with MNIST-1D S. Greydanus Dmitry Kobak 13 20 0 29 Nov 2020
Challenges in Deploying Machine Learning: a Survey of Case Studies Andrei Paleyes Raoul-Gabriel Urma Neil D. Lawrence 23 389 0 18 Nov 2020
Class-incremental learning: survey and performance evaluation on image classification Marc Masana Xialei Liu Bartlomiej Twardowski Mikel Menta Andrew D. Bagdanov Joost van de Weijer CLL 25 660 0 28 Oct 2020
Accenture at CheckThat! 2020: If you say so: Post-hoc fact-checking of claims using transformer-based models Evan Williams Paul Rodrigues Valerie Novak 34 42 0 05 Sep 2020