2004.10964
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 521 papers shown
Title
Online Continual Knowledge Learning for Language Models
Yuhao Wu
Tongjun Shi
Karthick Sharma
Chun Seah
Shuhao Zhang
CLL
KELM
33
4
0
16 Nov 2023
Controlled Text Generation for Black-box Language Models via Score-based Progressive Editor
Sangwon Yu
Changmin Lee
Hojin Lee
Sungroh Yoon
29
0
0
13 Nov 2023
AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification
Yongxin Huang
Kexin Wang
Sourav Dutta
Raj Nath Patel
Goran Glavas
Iryna Gurevych
VLM
22
4
0
01 Nov 2023
Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
Laura Cabello
Emanuele Bugliarello
Stephanie Brandl
Desmond Elliott
23
7
0
26 Oct 2023
GradSim: Gradient-Based Language Grouping for Effective Multilingual Training
Mingyang Wang
Heike Adel
Lukas Lange
Jannik Strötgen
Hinrich Schütze
33
3
0
23 Oct 2023
CLIFT: Analysing Natural Distribution Shift on Question Answering Models in Clinical Domain
Ankit Pal
19
2
0
19 Oct 2023
Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models
Luiza Amador Pozzobon
Beyza Ermis
Patrick Lewis
Sara Hooker
36
20
0
11 Oct 2023
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting
Hemant Yadav
Erica Cooper
Junichi Yamagishi
Sunayana Sitaram
R. Shah
11
0
0
08 Oct 2023
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Anastasios Kyrillidis
Robert Sim
87
6
0
04 Oct 2023
Controllable Text Generation with Residual Memory Transformer
Hanqing Zhang
Sun Si
Haiming Wu
Dawei Song
37
1
0
28 Sep 2023
Species196: A One-Million Semi-supervised Dataset for Fine-grained Species Recognition
W. He
Kai Han
Ying Nie
Chengcheng Wang
Yunhe Wang
VLM
48
6
0
25 Sep 2023
TouchUp-G: Improving Feature Representation through Graph-Centric Finetuning
Jing Zhu
Xiang Song
V. Ioannidis
Danai Koutra
Christos Faloutsos
62
13
0
25 Sep 2023
Personalized Adaptation with Pre-trained Speech Encoders for Continuous Emotion Recognition
Minh Tran
Yufeng Yin
M. Soleymani
56
2
0
05 Sep 2023
Refashioning Emotion Recognition Modelling: The Advent of Generalised Large Models
Zixing Zhang
Liyizhe Peng
Tao Pang
Jing Han
Huan Zhao
Bjorn W. Schuller
40
13
0
21 Aug 2023
SPM: Structured Pretraining and Matching Architectures for Relevance Modeling in Meituan Search
Wen-xin Zan
Yaopeng Han
Xiaotian Jiang
Yao Xiao
Yang Yang
Dayao Chen
Sheng Chen
32
3
0
15 Aug 2023
Continual Pre-Training of Large Language Models: How to (re)warm your model?
Kshitij Gupta
Benjamin Thérien
Adam Ibrahim
Mats L. Richter
Quentin G. Anthony
Eugene Belilovsky
Irina Rish
Timothée Lesort
KELM
35
99
0
08 Aug 2023
DaMSTF: Domain Adversarial Learning Enhanced Meta Self-Training for Domain Adaptation
Menglong Lu
Zhen Huang
Yunxiang Zhao
Zhiliang Tian
Yang Liu
Dongsheng Li
39
6
0
05 Aug 2023
Do not Mask Randomly: Effective Domain-adaptive Pre-training by Masking In-domain Keywords
Shahriar Golchin
Mihai Surdeanu
N. Tavabi
A. Kiapour
21
4
0
14 Jul 2023
Parameter-Efficient Fine-Tuning of LLaMA for the Clinical Domain
Aryo Pradipta Gema
Pasquale Minervini
Luke Daines
Tom Hope
Beatrice Alex
LM&MA
ALM
36
40
0
06 Jul 2023
Named Entity Inclusion in Abstractive Text Summarization
S. Berezin
Tatiana Batura
39
7
0
05 Jul 2023
Improving Language Plasticity via Pretraining with Active Forgetting
Yihong Chen
Kelly Marchisio
Roberta Raileanu
David Ifeoluwa Adelani
Pontus Stenetorp
Sebastian Riedel
Mikel Artetxe
KELM
AI4CE
CLL
37
24
0
03 Jul 2023
Could Small Language Models Serve as Recommenders? Towards Data-centric Cold-start Recommendations
Xuansheng Wu
Huachi Zhou
Yucheng Shi
Wenlin Yao
Xiao Shi Huang
Ninghao Liu
LRM
37
8
0
29 Jun 2023
Solving Dialogue Grounding Embodied Task in a Simulated Environment using Further Masked Language Modeling
Weijie Zhang
40
0
0
21 Jun 2023
KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation
Yuxi Feng
Xiaoyuan Yi
L. Lakshmanan
Xing Xie
27
1
0
17 Jun 2023
QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search
Jian Xie
Yidan Liang
Jingping Liu
Yanghua Xiao
Baohua Wu
Shenghua Ni
VLM
LRM
38
8
0
11 Jun 2023
Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories
Shizhe Diao
Tianyang Xu
Ruijia Xu
Jiawei Wang
Tong Zhang
MoE
AI4CE
13
36
0
08 Jun 2023
CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models
Potsawee Manakul
Yassir Fathullah
Adian Liusie
Vyas Raina
Vatsal Raina
Mark Gales
29
12
0
08 Jun 2023
Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction
Simone Scaboro
Beatrice Portelli
Emmanuele Chersoni
Enrico Santus
Giuseppe Serra
24
8
0
08 Jun 2023
LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for Sexism Detection and Classification
K. Chernyshev
E. Garanina
Duygu Bayram
Qiankun Zheng
Lukas Edman
13
0
0
08 Jun 2023
Good Data, Large Data, or No Data? Comparing Three Approaches in Developing Research Aspect Classifiers for Biomedical Papers
S. Chandrasekhar
Huang Chieh-Yang
Ting-Hao 'Kenneth' Huang
29
2
0
07 Jun 2023
A Scalable and Adaptive System to Infer the Industry Sectors of Companies: Prompt + Model Tuning of Generative Language Models
Le-le Cao
Vilhelm von Ehrenheim
Astrid Berghult
Cecilia Henje
Richard Anselmo Stahl
Joar Wandborg
S. Stan
Armin Catovic
Erik Ferm
Hannes Ingelhag
22
4
0
05 Jun 2023
shs-nlp at RadSum23: Domain-Adaptive Pre-training of Instruction-tuned LLMs for Radiology Report Impression Generation
Sanjeev Kumar Karn
Rikhiya Ghosh
P. Kusuma
Oladimeji Farri
LM&MA
MedIm
AI4CE
33
12
0
05 Jun 2023
RadLing: Towards Efficient Radiology Report Understanding
Rikhiya Ghosh
Sanjeev Kumar Karn
Manuela Danu
Larisa Micu
Ramya Vunikili
Oladimeji Farri
MedIm
29
6
0
04 Jun 2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
Rahul Madhavan
Rishabh Garg
Kahini Wadhawan
S. Mehta
29
5
0
01 Jun 2023
An Invariant Learning Characterization of Controlled Text Generation
Carolina Zheng
Claudia Shi
Keyon Vafa
Amir Feder
David M. Blei
OOD
38
8
0
31 May 2023
Measuring the Robustness of NLP Models to Domain Shifts
Nitay Calderon
Naveh Porat
Eyal Ben-David
Alexander Chapanin
Zorik Gekhman
Nadav Oved
Vitaly Shalumov
Roi Reichart
21
7
0
31 May 2023
Controlled Text Generation with Hidden Representation Transformations
Vaibhav Kumar
H. Koorehdavoudi
Masud Moshtaghi
Amita Misra
Ankit Chadha
Emilio Ferrara
26
3
0
30 May 2023
The Utility of Large Language Models and Generative AI for Education Research
Andrew Katz
Umair Shakir
B. Chambers
AI4CE
27
6
0
29 May 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
43
180
0
27 May 2023
HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis
Christoforos Vasilatos
Manaar Alam
Talal Rahwan
Yasir Zaki
Michail Maniatakos
DeLMO
40
32
0
26 May 2023
Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification
Mujeen Sung
James Gung
Elman Mansimov
Nikolaos Pappas
Raphael Shu
Salvatore Romeo
Yi Zhang
Vittorio Castelli
33
7
0
24 May 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Alex Wilf
Syeda Nahida Akter
Leena Mathur
Paul Pu Liang
Sheryl Mathew
Mengrou Shou
Eric Nyberg
Louis-Philippe Morency
CLL
SSL
32
4
0
23 May 2023
Selective Pre-training for Private Fine-tuning
Da Yu
Sivakanth Gopi
Janardhan Kulkarni
Zinan Lin
Saurabh Naik
Tomasz Religa
Jian Yin
Huishuai Zhang
40
19
0
23 May 2023
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models
Aitor Ormazabal
Mikel Artetxe
Eneko Agirre
37
19
0
23 May 2023
APPLS: Evaluating Evaluation Metrics for Plain Language Summarization
Yue Guo
Tal August
Gondy Leroy
T. Cohen
Lucy Lu Wang
57
9
0
23 May 2023
Rethinking Semi-supervised Learning with Language Models
Zhengxiang Shi
Francesco Tonolini
Nikolaos Aletras
Emine Yilmaz
G. Kazai
Yunlong Jiao
32
18
0
22 May 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Xiao Wang
Wei Zhou
Qi Zhang
Jie Zhou
Songyang Gao
Junzhe Wang
Menghan Zhang
Xiang Gao
Yunwen Chen
Tao Gui
48
7
0
22 May 2023
Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting
Xinlu Zhang
Shiyang Li
Xianjun Yang
Chenxin Tian
Yao Qin
Linda R. Petzold
28
9
0
22 May 2023
TADA: Efficient Task-Agnostic Domain Adaptation for Transformers
Chia-Chien Hung
Lukas Lange
Jannik Strötgen
30
9
0
22 May 2023
Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis
Seraphina Goldfarb-Tarrant
Bjorn Ross
Adam Lopez
39
7
0
22 May 2023