2004.10964
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 522 papers shown
Title
DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context Tuning
Praveen Venkateswaran
Evelyn Duesterwald
Vatche Isahagian
41
7
0
06 Dec 2022
LUNA: Language Understanding with Number Augmentations on Transformers via Number Plugins and Pre-training
Hongwei Han
Jialiang Xu
Mengyuan Zhou
Yijia Shao
Shi Han
Dongmei Zhang
LMTD
29
7
0
06 Dec 2022
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Hamish Ivison
Noah A. Smith
Hannaneh Hajishirzi
Pradeep Dasigi
38
20
0
01 Dec 2022
ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format
Qi Zhu
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Baolin Peng
...
Dazhen Wan
Xiaochen Zhu
Jianfeng Gao
Milica Gašić
Minlie Huang
56
23
0
30 Nov 2022
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Joel Niklaus
Daniele Giofré
33
11
0
30 Nov 2022
Rationale-Guided Few-Shot Classification to Detect Abusive Language
Punyajoy Saha
Divyanshu Sheth
Kushal Kedia
Binny Mathew
Animesh Mukherjee
9
3
0
30 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
41
2
0
27 Nov 2022
Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models
Kenan Tang
Hanchun Jiang
AI4CE
18
1
0
26 Nov 2022
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Xiang Dai
Sarvnaz Karimi
32
3
0
24 Nov 2022
Using Selective Masking as a Bridge between Pre-training and Fine-tuning
Tanish Lad
Himanshu Maheshwari
Shreyas Kottukkal
R. Mamidi
24
3
0
24 Nov 2022
Continual Learning of Natural Language Processing Tasks: A Survey
Zixuan Ke
Bin Liu
KELM
CLL
VLM
37
69
0
23 Nov 2022
TCBERT: A Technical Report for Chinese Topic Classification BERT
Ting Han
Kunhao Pan
Xinyu Chen
Dingjie Song
Yuchen Fan
Xinyu Gao
Ruyi Gan
Jiaxing Zhang
VLM
25
1
0
21 Nov 2022
An Efficient Active Learning Pipeline for Legal Text Classification
Sepideh Mamooler
R. Lebret
Stéphane Massonnet
Karl Aberer
AILaw
27
4
0
15 Nov 2022
Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps
Hiroki Iida
Naoaki Okazaki
42
4
0
08 Nov 2022
Coarse-to-fine Knowledge Graph Domain Adaptation based on Distantly-supervised Iterative Training
Hongmin Cai
Wenxiong Liao
Zheng Liu
Yiyang Zhang
Xiaoke Huang
...
Lingfei Wu
Ninghao Liu
Quanzheng Li
Tianming Liu
Xiang Li
14
20
0
05 Nov 2022
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
Chan-Jan Hsu
Ho-Lam Chung
Hung-yi Lee
Yu Tsao
29
6
0
01 Nov 2022
Where to start? Analyzing the potential value of intermediate models
Leshem Choshen
Elad Venezian
Shachar Don-Yehiya
Noam Slonim
Yoav Katz
MoMe
27
27
0
31 Oct 2022
WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain
Raj Sanjay Shah
Kunal Chawla
Dheeraj Eidnani
Agam Shah
Wendi Du
Sudheer Chava
Natraj Raman
Charese Smiley
Jiaao Chen
Diyi Yang
AIFin
37
103
0
31 Oct 2022
Generating Sequences by Learning to Self-Correct
Sean Welleck
Ximing Lu
Peter West
Faeze Brahman
T. Shen
Daniel Khashabi
Yejin Choi
LRM
38
217
0
31 Oct 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
45
79
0
31 Oct 2022
Parameter-Efficient Tuning Makes a Good Classification Head
Zhuoyi Yang
Ming Ding
Yanhui Guo
Qingsong Lv
Jie Tang
VLM
63
14
0
30 Oct 2022
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
Yue Yu
Chenyan Xiong
Si Sun
Chao Zhang
Arnold Overwijk
VLM
OOD
52
22
0
27 Oct 2022
Learning on Large-scale Text-attributed Graphs via Variational Inference
Jianan Zhao
Meng Qu
Chaozhuo Li
Hao Yan
Qian Liu
Rui Li
Xing Xie
Jian Tang
VLM
37
134
0
26 Oct 2022
Predicting Long-Term Citations from Short-Term Linguistic Influence
Sandeep Soni
David Bamman
Jacob Eisenstein
23
2
0
24 Oct 2022
Knowledge Transfer from Answer Ranking to Answer Generation
Matteo Gabburo
Rik Koncel-Kedziorski
Siddhant Garg
Luca Soldaini
Alessandro Moschitti
33
7
0
23 Oct 2022
Cross-domain Generalization for AMR Parsing
Xuefeng Bai
Sen Yang
Leyang Cui
Linfeng Song
Yue Zhang
49
2
0
22 Oct 2022
NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation
Phillip Howard
Gadi Singer
Vasudev Lal
Yejin Choi
Swabha Swayamdipta
CML
60
25
0
22 Oct 2022
A Survey of Active Learning for Natural Language Processing
Zhisong Zhang
Emma Strubell
Eduard H. Hovy
LM&MA
35
65
0
18 Oct 2022
Using Bottleneck Adapters to Identify Cancer in Clinical Notes under Low-Resource Constraints
Omid Rohanian
Hannah Jauncey
Mohammadmahdi Nouriborji
Vinod Kumar Chauhan
Bronner P. Gonçalves
Christiana Kartsonaki
ISARIC Clinical Characterisation Group
L. Merson
David Clifton
24
7
0
17 Oct 2022
Table-To-Text generation and pre-training with TabT5
Ewa Andrejczuk
Julian Martin Eisenschlos
Francesco Piccinno
Syrine Krichene
Yasemin Altun
LMTD
34
31
0
17 Oct 2022
Improving generalizability of distilled self-supervised speech processing models under distorted settings
Kuan-Po Huang
Yu-Kuan Fu
Tsung-Yuan Hsu
Fabian Ritter-Gutierrez
Fan Wang
Liang-Hsuan Tseng
Yu Zhang
Hung-yi Lee
32
14
0
14 Oct 2022
Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge
Kosuke Nishida
Naoki Yoshinaga
Kyosuke Nishida
37
2
0
14 Oct 2022
Developing a general-purpose clinical language inference model from a large corpus of clinical notes
Madhumita Sushil
Dana Ludwig
A. Butte
V. Rudrapatna
LM&MA
14
12
0
12 Oct 2022
EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain
Amir Hadifar
Semere Kiros Bitew
Johannes Deleu
Chris Develder
Thomas Demeester
AI4Ed
41
18
0
12 Oct 2022
MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score
Sunjae Kwon
Zonghai Yao
H. Jordan
David Levy
Brian Corner
Hong-ye Yu
30
18
0
12 Oct 2022
Knowledge Distillation Transfer Sets and their Impact on Downstream NLU Tasks
Charith Peris
Lizhen Tan
Thomas Gueudré
Turan Gojayev
Vivi Wei
Gokmen Oz
30
4
0
10 Oct 2022
Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization
Zonghan Yang
Xiaoyuan Yi
Peng Li
Yang Liu
Xing Xie
38
33
0
10 Oct 2022
Leveraging Key Information Modeling to Improve Less-Data Constrained News Headline Generation via Duality Fine-Tuning
Zhuoxuan Jiang
Lingfeng Qiao
Di Yin
Shanshan Feng
Bo Ren
SyDa
30
2
0
10 Oct 2022
KSAT: Knowledge-infused Self Attention Transformer -- Integrating Multiple Domain-Specific Contexts
Kaushik Roy
Yuxin Zi
Vignesh Narayanan
Manas Gaur
Amit P. Sheth
AI4MH
46
12
0
09 Oct 2022
Spread Love Not Hate: Undermining the Importance of Hateful Pre-training for Hate Speech Detection
Omkar Gokhale
Aditya Kane
Shantanu Patankar
Tanmay Chavan
Raviraj Joshi
VLM
35
7
0
09 Oct 2022
On Task-Adaptive Pretraining for Dialogue Response Selection
Tzu-Hsiang Lin
Ta-Chung Chi
Anna Rumshisky
21
1
0
08 Oct 2022
Short Text Pre-training with Extended Token Classification for E-commerce Query Understanding
Haoming Jiang
Tianyu Cao
Zheng Li
Cheng-hsin Luo
Xianfeng Tang
Qingyu Yin
Danqing Zhang
R. Goutam
Bing Yin
RALM
37
11
0
08 Oct 2022
Calibrating Factual Knowledge in Pretrained Language Models
Qingxiu Dong
Damai Dai
Yifan Song
Jingjing Xu
Zhifang Sui
Lei Li
KELM
249
83
0
07 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
129
95
0
06 Oct 2022
SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis
Jiaxin Pei
Vítor Silva
Maarten W. Bos
Yozen Liu
Leonardo Neves
David Jurgens
Francesco Barbieri
55
28
0
03 Oct 2022
DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing
Yanjun Gao
Dmitriy Dligach
Timothy A. Miller
John R. Caskey
Brihat Sharma
M. Churpek
Majid Afshar
ELM
LRM
34
17
0
29 Sep 2022
Downstream Datasets Make Surprisingly Good Pretraining Corpora
Kundan Krishna
Saurabh Garg
Jeffrey P. Bigham
Zachary Chase Lipton
50
30
0
28 Sep 2022
PePe: Personalized Post-editing Model utilizing User-generated Post-edits
Jihyeon Janel Lee
Taehee Kim
Yunwon Tae
Cheonbok Park
Jaegul Choo
24
0
0
21 Sep 2022
Generating Persuasive Responses to Customer Reviews with Multi-Source Prior Knowledge in E-commerce
Bo Chen
Jiayi Liu
M. Maimaiti
Xing Gao
Ji Zhang
25
3
0
20 Sep 2022
Generalizing through Forgetting -- Domain Generalization for Symptom Event Extraction in Clinical Notes
Sitong Zhou
K. Lybarger
Meliha Yetisgen-Yildiz
Mari Ostendorf
40
2
0
20 Sep 2022