1910.03771
HuggingFace's Transformers: State-of-the-art Natural Language Processing
9 October 2019
Thomas Wolf
Lysandre Debut
Victor Sanh
Julien Chaumond
Clement Delangue
Anthony Moi
Pierric Cistac
Tim Rault
Rémi Louf
Morgan Funtowicz
Joe Davison
Sam Shleifer
Patrick von Platen
Clara Ma
Yacine Jernite
J. Plu
Canwen Xu
Teven Le Scao
Sylvain Gugger
Mariama Drame
Quentin Lhoest
Alexander M. Rush
AI4CE
Github (144926★)
Papers citing "HuggingFace's Transformers: State-of-the-art Natural Language Processing"
50 / 503 papers shown
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
Elias Frantar
Dan Alistarh
VLM
126
737
0
02 Jan 2023
When Do Decompositions Help for Machine Reading?
Kangda Wei
Dawn J Lawrie
Benjamin Van Durme
Yunmo Chen
Orion Weller
ReLM
152
3
0
20 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preoţiuc-Pietro
Pengxiang Cheng
FedML
MoMe
99
250
0
19 Dec 2022
JEMMA: An Extensible Java Dataset for ML4Code Applications
Anjan Karmakar
Miltiadis Allamanis
Romain Robbes
VLM
55
3
0
18 Dec 2022
Evaluating Step-by-Step Reasoning through Symbolic Verification
Yi-Fan Zhang
Hanlin Zhang
Li Erran Li
Eric P. Xing
ReLM
LRM
87
8
0
16 Dec 2022
Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue Systems
Denis Emelin
Daniele Bonadiman
Sawsan Alqahtani
Yi Zhang
Saab Mansour
82
17
0
15 Dec 2022
Federated Few-Shot Learning for Mobile NLP
Dongqi Cai
Shangguang Wang
Yaozong Wu
F. Lin
Mengwei Xu
FedML
93
12
0
12 Dec 2022
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Ashwinee Panda
Xinyu Tang
Saeed Mahloujifar
Vikash Sehwag
Prateek Mittal
126
12
0
08 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
213
522
0
08 Dec 2022
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
163
386
0
07 Dec 2022
Towards Practical Few-shot Federated NLP
Dongqi Cai
Yaozong Wu
Haitao Yuan
Shangguang Wang
F. Lin
Mengwei Xu
FedML
84
6
0
01 Dec 2022
COMET: A Comprehensive Cluster Design Methodology for Distributed Deep Learning Training
D. Kadiyala
Saeed Rashidi
Taekyung Heo
Abhimanyu Bambhaniya
T. Krishna
Alexandros Daglis
VLM
74
7
0
30 Nov 2022
RoentGen: Vision-Language Foundation Model for Chest X-ray Generation
Pierre J. Chambon
Christian Blüthgen
Jean-Benoit Delbrouck
Rogier van der Sluijs
M. Polacin
Juan Manuel Zambrano Chaves
Tanishq Mathew Abraham
Shivanshu Purohit
C. Langlotz
Akshay S. Chaudhari
LM&MA
DiffM
MedIm
89
102
0
23 Nov 2022
Generative Aspect-Based Sentiment Analysis with Contrastive Learning and Expressive Structure
Joseph Peper
Lu Wang
44
34
0
14 Nov 2022
Efficient Adversarial Training with Robust Early-Bird Tickets
Zhiheng Xi
Rui Zheng
Tao Gui
Qi Zhang
Xuanjing Huang
AAML
84
9
0
14 Nov 2022
Estimating Soft Labels for Out-of-Domain Intent Detection
Hao Lang
Yinhe Zheng
Jian Sun
Feiling Huang
Luo Si
Yongbin Li
76
15
0
10 Nov 2022
ADEPT: A DEbiasing PrompT Framework
Ke Yang
Charles Yu
Yi R. Fung
Manling Li
Heng Ji
122
27
0
10 Nov 2022
Using Deep Mixture-of-Experts to Detect Word Meaning Shift for TempoWiC
Ze Chen
Kangxu Wang
Zijian Cai
Jiewen Zheng
Jiarong He
Max Gao
Jason Zhang
MoE
52
3
0
07 Nov 2022
When Language Model Meets Private Library
Daoguang Zan
Bei Chen
Zeqi Lin
Bei Guan
Yongji Wang
Jian-Guang Lou
ALM
131
74
0
31 Oct 2022
1Cademy @ Causal News Corpus 2022: Enhance Causal Span Detection via Beam-Search-based Position Selector
Xingran Chen
Ge Zhang
A. Nik
Mingyu Li
Jie Fu
80
5
0
31 Oct 2022
Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe
Xiang Yue
Huseyin A. Inan
Xuechen Li
Girish Kumar
Julia McAnallen
Hoda Shajari
Huan Sun
David Levitan
Robert Sim
152
86
0
25 Oct 2022
Are All Spurious Features in Natural Language Alike? An Analysis through a Causal Lens
Nitish Joshi
X. Pan
Hengxing He
CML
123
30
0
25 Oct 2022
On-Demand Sampling: Learning Optimally from Multiple Distributions
Nika Haghtalab
Michael I. Jordan
Eric Zhao
FedML
149
38
0
22 Oct 2022
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models
Alessandro Stolfo
Zhijing Jin
Kumar Shridhar
Bernhard Schölkopf
Mrinmaya Sachan
ELM
OOD
LRM
133
66
0
21 Oct 2022
Modeling Document-level Temporal Structures for Building Temporal Dependency Graphs
Prafulla Kumar Choubey
Ruihong Huang
45
3
0
21 Oct 2022
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text
Zifeng Wang
Zhenbang Wu
Dinesh Agarwal
Jimeng Sun
CLIP
VLM
MedIm
129
434
0
18 Oct 2022
Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature
Tomas Goldsack
Zhihao Zhang
Chenghua Lin
Carolina Scarton
86
77
0
18 Oct 2022
Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models
Qihuang Zhong
Liang Ding
Li Shen
Peng Mi
Juhua Liu
Bo Du
Dacheng Tao
AAML
90
51
0
11 Oct 2022
CHAE: Fine-Grained Controllable Story Generation with Characters, Actions and Emotions
Xinpeng Wang
Han Jiang
Zhihua Wei
Shanlin Zhou
74
7
0
11 Oct 2022
Understanding HTML with Large Language Models
Izzeddin Gur
Ofir Nachum
Yingjie Miao
Mustafa Safdari
Austin Huang
Aakanksha Chowdhery
Sharan Narang
Noah Fiedel
Aleksandra Faust
AI4CE
225
71
0
08 Oct 2022
Just ClozE! A Novel Framework for Evaluating the Factual Consistency Faster in Abstractive Summarization
Yiyang Li
Lei Li
Marina Litvak
N. Vanetik
Dingxing Hu
Yuze Li
Yanquan Zhou
HILM
77
0
0
06 Oct 2022
Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering
Shamane Siriwardhana
Rivindu Weerasekera
Elliott Wen
Tharindu Kaluarachchi
R. Rana
Suranga Nanayakkara
VLM
84
187
0
06 Oct 2022
The Vendi Score: A Diversity Evaluation Metric for Machine Learning
Dan Friedman
Adji Bousso Dieng
EGVM
165
128
0
05 Oct 2022
Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification
Muhammad N. ElNokrashy
Badr AlKhamissi
Mona T. Diab
MoMe
90
5
0
30 Sep 2022
CCTCOVID: COVID-19 Detection from Chest X-Ray Images Using Compact Convolutional Transformers
Abdolreza Marefat
Mahdieh Marefat
Javad Hassannataj Joloudari
M. Nematollahi
Reza Lashgari
ViT
MedIm
73
13
0
27 Sep 2022
Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity
Gabriel Simmons
176
67
0
24 Sep 2022
Unsupervised domain adaptation for speech recognition with unsupervised error correction
Long Mai
Julie Carson-Berndsen
103
8
0
24 Sep 2022
Twitter Topic Classification
Dimosthenis Antypas
Asahi Ushio
Jose Camacho-Collados
Leonardo Neves
Vítor Silva
Francesco Barbieri
103
33
0
20 Sep 2022
Importance Tempering: Group Robustness for Overparameterized Models
Yiping Lu
Wenlong Ji
Zachary Izzo
Lexing Ying
87
7
0
19 Sep 2022
Order-Disorder: Imitation Adversarial Attacks for Black-box Neural Ranking Models
Jiawei Liu
Yangyang Kang
Di Tang
Kaisong Song
Changlong Sun
Xiaofeng Wang
Wei Lu
Xiaozhong Liu
AAML
116
42
0
14 Sep 2022
Selective Annotation Makes Language Models Better Few-Shot Learners
Hongjin Su
Jungo Kasai
Chen Henry Wu
Weijia Shi
Tianlu Wang
...
Rui Zhang
Mari Ostendorf
Luke Zettlemoyer
Noah A. Smith
Tao Yu
118
262
0
05 Sep 2022
ChemBERTa-2: Towards Chemical Foundation Models
Walid Ahmad
Elana Simon
Seyone Chithrananda
Gabriel Grand
Bharath Ramsundar
AI4CE
68
141
0
05 Sep 2022
ArgLegalSumm: Improving Abstractive Summarization of Legal Documents with Argument Mining
Mohamed S. Elaraby
Diane Litman
AILaw
ELM
103
33
0
04 Sep 2022
DualVoice: Speech Interaction that Discriminates between Normal and Whispered Voice Input
Jun Rekimoto
58
6
0
22 Aug 2022
Reliable Decision from Multiple Subtasks through Threshold Optimization: Content Moderation in the Wild
Donghyun Son
Byounggyu Lew
Kwanghee Choi
Yongsu Baek
Seungwoo Choi
Beomjun Shin
S. Ha
Buru Chang
57
10
0
16 Aug 2022
Reproduction and Replication of an Adversarial Stylometry Experiment
Haining Wang
P. Juola
A. Riddell
57
2
0
15 Aug 2022
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
144
666
0
15 Aug 2022
An Algorithm-Hardware Co-Optimized Framework for Accelerating N:M Sparse Transformers
Chao Fang
Aojun Zhou
Zhongfeng Wang
MoE
76
54
0
12 Aug 2022
A Multimodal Transformer: Fusing Clinical Notes with Structured EHR Data for Interpretable In-Hospital Mortality Prediction
Weimin Lyu
Xinyu Dong
Rachel Wong
Songzhu Zheng
Kayley Abell-Hart
Fusheng Wang
Chao Chen
111
52
0
09 Aug 2022
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
Saleh Soltan
Shankar Ananthakrishnan
Jack G. M. FitzGerald
Rahul Gupta
Wael Hamza
...
Mukund Sridhar
Fabian Triefenbach
Apurv Verma
Gokhan Tur
Premkumar Natarajan
129
83
0
02 Aug 2022