ResearchTrend.AI
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
    VLM
    AI4CE
    CLL

Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"

50 / 521 papers shown
Online Continual Knowledge Learning for Language Models
Yuhao Wu
Tongjun Shi
Karthick Sharma
Chun Seah
Shuhao Zhang
CLL
KELM
33
4
0
16 Nov 2023
Controlled Text Generation for Black-box Language Models via Score-based Progressive Editor
Sangwon Yu
Changmin Lee
Hojin Lee
Sungroh Yoon
29
0
0
13 Nov 2023
AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification
Yongxin Huang
Kexin Wang
Sourav Dutta
Raj Nath Patel
Goran Glavas
Iryna Gurevych
VLM
22
4
0
01 Nov 2023
Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
Laura Cabello
Emanuele Bugliarello
Stephanie Brandl
Desmond Elliott
23
7
0
26 Oct 2023
GradSim: Gradient-Based Language Grouping for Effective Multilingual Training
Mingyang Wang
Heike Adel
Lukas Lange
Jannik Strötgen
Hinrich Schütze
33
3
0
23 Oct 2023
CLIFT: Analysing Natural Distribution Shift on Question Answering Models in Clinical Domain
Ankit Pal
19
2
0
19 Oct 2023
Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models
Luiza Amador Pozzobon
Beyza Ermis
Patrick Lewis
Sara Hooker
36
20
0
11 Oct 2023
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting
Hemant Yadav
Erica Cooper
Junichi Yamagishi
Sunayana Sitaram
R. Shah
11
0
0
08 Oct 2023
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Anastasios Kyrillidis
Robert Sim
87
6
0
04 Oct 2023
Controllable Text Generation with Residual Memory Transformer
Hanqing Zhang
Sun Si
Haiming Wu
Dawei Song
37
1
0
28 Sep 2023
Species196: A One-Million Semi-supervised Dataset for Fine-grained Species Recognition
W. He
Kai Han
Ying Nie
Chengcheng Wang
Yunhe Wang
VLM
48
6
0
25 Sep 2023
TouchUp-G: Improving Feature Representation through Graph-Centric Finetuning
Jing Zhu
Xiang Song
V. Ioannidis
Danai Koutra
Christos Faloutsos
62
13
0
25 Sep 2023
Personalized Adaptation with Pre-trained Speech Encoders for Continuous Emotion Recognition
Minh Tran
Yufeng Yin
M. Soleymani
56
2
0
05 Sep 2023
Refashioning Emotion Recognition Modelling: The Advent of Generalised Large Models
Zixing Zhang
Liyizhe Peng
Tao Pang
Jing Han
Huan Zhao
Bjorn W. Schuller
40
13
0
21 Aug 2023
SPM: Structured Pretraining and Matching Architectures for Relevance Modeling in Meituan Search
Wen-xin Zan
Yaopeng Han
Xiaotian Jiang
Yao Xiao
Yang Yang
Dayao Chen
Sheng Chen
32
3
0
15 Aug 2023
Continual Pre-Training of Large Language Models: How to (re)warm your model?
Kshitij Gupta
Benjamin Thérien
Adam Ibrahim
Mats L. Richter
Quentin G. Anthony
Eugene Belilovsky
Irina Rish
Timothée Lesort
KELM
35
99
0
08 Aug 2023
DaMSTF: Domain Adversarial Learning Enhanced Meta Self-Training for Domain Adaptation
Menglong Lu
Zhen Huang
Yunxiang Zhao
Zhiliang Tian
Yang Liu
Dongsheng Li
39
6
0
05 Aug 2023
Do not Mask Randomly: Effective Domain-adaptive Pre-training by Masking In-domain Keywords
Shahriar Golchin
Mihai Surdeanu
N. Tavabi
A. Kiapour
21
4
0
14 Jul 2023
Parameter-Efficient Fine-Tuning of LLaMA for the Clinical Domain
Aryo Pradipta Gema
Pasquale Minervini
Luke Daines
Tom Hope
Beatrice Alex
LM&MA
ALM
36
40
0
06 Jul 2023
Named Entity Inclusion in Abstractive Text Summarization
S. Berezin
Tatiana Batura
39
7
0
05 Jul 2023
Improving Language Plasticity via Pretraining with Active Forgetting
Yihong Chen
Kelly Marchisio
Roberta Raileanu
David Ifeoluwa Adelani
Pontus Stenetorp
Sebastian Riedel
Mikel Artetxe
KELM
AI4CE
CLL
37
24
0
03 Jul 2023
Could Small Language Models Serve as Recommenders? Towards Data-centric Cold-start Recommendations
Xuansheng Wu
Huachi Zhou
Yucheng Shi
Wenlin Yao
Xiao Shi Huang
Ninghao Liu
LRM
37
8
0
29 Jun 2023
Solving Dialogue Grounding Embodied Task in a Simulated Environment using Further Masked Language Modeling
Weijie Zhang
40
0
0
21 Jun 2023
KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation
Yuxi Feng
Xiaoyuan Yi
L. Lakshmanan
Xing Xie
27
1
0
17 Jun 2023
QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search
Jian Xie
Yidan Liang
Jingping Liu
Yanghua Xiao
Baohua Wu
Shenghua Ni
VLM
LRM
38
8
0
11 Jun 2023
Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories
Shizhe Diao
Tianyang Xu
Ruijia Xu
Jiawei Wang
Tong Zhang
MoE
AI4CE
13
36
0
08 Jun 2023
CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models
Potsawee Manakul
Yassir Fathullah
Adian Liusie
Vyas Raina
Vatsal Raina
Mark Gales
29
12
0
08 Jun 2023
Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction
Simone Scaboro
Beatrice Portelli
Emmanuele Chersoni
Enrico Santus
Giuseppe Serra
24
8
0
08 Jun 2023
LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for Sexism Detection and Classification
K. Chernyshev
E. Garanina
Duygu Bayram
Qiankun Zheng
Lukas Edman
13
0
0
08 Jun 2023
Good Data, Large Data, or No Data? Comparing Three Approaches in Developing Research Aspect Classifiers for Biomedical Papers
S. Chandrasekhar
Huang Chieh-Yang
Ting-Hao 'Kenneth' Huang
29
2
0
07 Jun 2023
A Scalable and Adaptive System to Infer the Industry Sectors of Companies: Prompt + Model Tuning of Generative Language Models
Le-le Cao
Vilhelm von Ehrenheim
Astrid Berghult
Cecilia Henje
Richard Anselmo Stahl
Joar Wandborg
S. Stan
Armin Catovic
Erik Ferm
Hannes Ingelhag
22
4
0
05 Jun 2023
shs-nlp at RadSum23: Domain-Adaptive Pre-training of Instruction-tuned LLMs for Radiology Report Impression Generation
Sanjeev Kumar Karn
Rikhiya Ghosh
P. Kusuma
Oladimeji Farri
LM&MA
MedIm
AI4CE
33
12
0
05 Jun 2023
RadLing: Towards Efficient Radiology Report Understanding
Rikhiya Ghosh
Sanjeev Kumar Karn
Manuela Danu
Larisa Micu
Ramya Vunikili
Oladimeji Farri
MedIm
29
6
0
04 Jun 2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
Rahul Madhavan
Rishabh Garg
Kahini Wadhawan
S. Mehta
29
5
0
01 Jun 2023
An Invariant Learning Characterization of Controlled Text Generation
Carolina Zheng
Claudia Shi
Keyon Vafa
Amir Feder
David M. Blei
OOD
38
8
0
31 May 2023
Measuring the Robustness of NLP Models to Domain Shifts
Nitay Calderon
Naveh Porat
Eyal Ben-David
Alexander Chapanin
Zorik Gekhman
Nadav Oved
Vitaly Shalumov
Roi Reichart
21
7
0
31 May 2023
Controlled Text Generation with Hidden Representation Transformations
Vaibhav Kumar
H. Koorehdavoudi
Masud Moshtaghi
Amita Misra
Ankit Chadha
Emilio Ferrara
26
3
0
30 May 2023
The Utility of Large Language Models and Generative AI for Education Research
Andrew Katz
Umair Shakir
B. Chambers
AI4CE
27
6
0
29 May 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
43
180
0
27 May 2023
HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis
Christoforos Vasilatos
Manaar Alam
Talal Rahwan
Yasir Zaki
Michail Maniatakos
DeLMO
40
32
0
26 May 2023
Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification
Mujeen Sung
James Gung
Elman Mansimov
Nikolaos Pappas
Raphael Shu
Salvatore Romeo
Yi Zhang
Vittorio Castelli
33
7
0
24 May 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Alex Wilf
Syeda Nahida Akter
Leena Mathur
Paul Pu Liang
Sheryl Mathew
Mengrou Shou
Eric Nyberg
Louis-Philippe Morency
CLL
SSL
32
4
0
23 May 2023
Selective Pre-training for Private Fine-tuning
Da Yu
Sivakanth Gopi
Janardhan Kulkarni
Zinan Lin
Saurabh Naik
Tomasz Religa
Jian Yin
Huishuai Zhang
40
19
0
23 May 2023
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models
Aitor Ormazabal
Mikel Artetxe
Eneko Agirre
37
19
0
23 May 2023
APPLS: Evaluating Evaluation Metrics for Plain Language Summarization
Yue Guo
Tal August
Gondy Leroy
T. Cohen
Lucy Lu Wang
57
9
0
23 May 2023
Rethinking Semi-supervised Learning with Language Models
Zhengxiang Shi
Francesco Tonolini
Nikolaos Aletras
Emine Yilmaz
G. Kazai
Yunlong Jiao
32
18
0
22 May 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Xiao Wang
Wei Zhou
Qi Zhang
Jie Zhou
Songyang Gao
Junzhe Wang
Menghan Zhang
Xiang Gao
Yunwen Chen
Tao Gui
48
7
0
22 May 2023
Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting
Xinlu Zhang
Shiyang Li
Xianjun Yang
Chenxin Tian
Yao Qin
Linda R. Petzold
28
9
0
22 May 2023
TADA: Efficient Task-Agnostic Domain Adaptation for Transformers
Chia-Chien Hung
Lukas Lange
Jannik Strötgen
30
9
0
22 May 2023
Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis
Seraphina Goldfarb-Tarrant
Bjorn Ross
Adam Lopez
39
7
0
22 May 2023