Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
    VLM
    AI4CE
    CLL

Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"

50 / 522 papers shown
Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis
Seraphina Goldfarb-Tarrant
Bjorn Ross
Adam Lopez
39
7
0
22 May 2023
On the Limitations of Simulating Active Learning
Katerina Margatina
Nikolaos Aletras
31
11
0
21 May 2023
Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification
Renliang Sun
Wei Xu
Xiaojun Wan
CLL
27
17
0
21 May 2023
Patton: Language Model Pretraining on Text-Rich Networks
Bowen Jin
Wentao Zhang
Yu Zhang
Yu Meng
Xinyang Zhang
Qi Zhu
Jiawei Han
VLM
48
45
0
20 May 2023
ESCOXLM-R: Multilingual Taxonomy-driven Pre-training for the Job Market Domain
Mike Zhang
Rob van der Goot
Barbara Plank
26
14
0
20 May 2023
Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews
Hye Sun Yun
Iain J. Marshall
T. Trikalinos
Byron C. Wallace
26
17
0
19 May 2023
"Nothing Abnormal": Disambiguating Medical Reports via Contrastive Knowledge Infusion
Zexue He
An Yan
Amilcare Gentili
Julian McAuley
Chun-Nan Hsu
MedIm
35
2
0
15 May 2023
CroSentiNews 2.0: A Sentence-Level News Sentiment Corpus
Gaurish Thakkar
Nives Mikelic Preradović
Marko Tadić
16
1
0
14 May 2023
How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems
Cara Van Uden
Jeremy Irvin
Mars Huang
N. Dean
J. Carr
A. Ng
C. Langlotz
OOD
34
1
0
13 May 2023
When and What to Ask Through World States and Text Instructions: IGLU NLP Challenge Solution
Zhengxiang Shi
Jerome Ramos
To Eun Kim
Xi Wang
Hossein A. Rahmani
Aldo Lipani
31
10
0
09 May 2023
Going beyond research datasets: Novel intent discovery in the industry setting
Aleksandra Chrabrowa
Tsimur Hadeliya
D. Kajtoch
Robert Mroczkowski
Piotr Rybak
24
2
0
09 May 2023
OPI at SemEval 2023 Task 9: A Simple But Effective Approach to Multilingual Tweet Intimacy Analysis
Slawomir Dadas
25
2
0
14 Apr 2023
Attention at SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS)
Debashish Roy
Manish Shrivastava
25
1
0
10 Apr 2023
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
Emilio Ferrara
SILM
36
248
0
07 Apr 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
99
793
0
30 Mar 2023
AutoAD: Movie Description in Context
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
24
34
0
29 Mar 2023
End-to-End $n$-ary Relation Extraction for Combination Drug Therapies
Yuhang Jiang
Ramakanth Kavuluru
32
7
0
29 Mar 2023
SwissBERT: The Multilingual Language Model for Switzerland
Jannis Vamvas
Johannes Graen
Rico Sennrich
43
6
0
23 Mar 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
38
509
0
07 Mar 2023
DiTTO: A Feature Representation Imitation Approach for Improving Cross-Lingual Transfer
Shanu Kumar
Abbaraju Soujanya
Sandipan Dandapat
Sunayana Sitaram
Monojit Choudhury
VLM
33
1
0
04 Mar 2023
Choosing Public Datasets for Private Machine Learning via Gradient Subspace Distance
Xin Gu
Gautam Kamath
Zhiwei Steven Wu
33
12
0
02 Mar 2023
CLICKER: Attention-Based Cross-Lingual Commonsense Knowledge Transfer
Ruolin Su
Zhongkai Sun
Sixing Lu
Chengyuan Ma
Chenlei Guo
LRM
26
0
0
26 Feb 2023
$k$NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models
Yangsibo Huang
Daogao Liu
Zexuan Zhong
Weijia Shi
Y. Lee
RALM
ALM
43
14
0
21 Feb 2023
Auditing large language models: a three-layered approach
Jakob Mokander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw
MLAU
48
196
0
16 Feb 2023
AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models
Alexandra Chronopoulou
Matthew E. Peters
Alexander Fraser
Jesse Dodge
MoMe
32
66
0
14 Feb 2023
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
Shuyan Zhou
Uri Alon
Sumit Agarwal
Graham Neubig
ELM
ALM
40
103
0
10 Feb 2023
Leveraging supplementary text data to kick-start automatic speech recognition system development with limited transcriptions
Nay San
Martijn Bartelds
Blaine Billings
Ella de Falco
Hendi Feriza
Johan Safri
Wawan Sahrozi
Ben Foley
Bradley McDonnell
Dan Jurafsky
26
9
0
09 Feb 2023
Towards Geospatial Foundation Models via Continual Pretraining
Matías Mendieta
Boran Han
Xingjian Shi
Yi Zhu
Chen Chen
VLM
AI4CE
48
65
0
09 Feb 2023
Data Selection for Language Models via Importance Resampling
Sang Michael Xie
Shibani Santurkar
Tengyu Ma
Percy Liang
46
173
0
06 Feb 2023
Finding the Law: Enhancing Statutory Article Retrieval via Graph Neural Networks
Antoine Louis
Gijs van Dijck
Gerasimos Spanakis
AILaw
23
9
0
30 Jan 2023
Learning Optimal Features via Partial Invariance
Moulik Choraria
Ibtihal Ferwana
Ankur Mani
Lav Varshney
OOD
28
2
0
28 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
44
2
0
26 Jan 2023
Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval
Asha Vishwanathan
R. Warrier
G. V. Suresh
Chandrashekhar Kandpal
13
2
0
25 Jan 2023
Audience-Centric Natural Language Generation via Style Infusion
Samraj Moorjani
A. Krishnan
Hari Sundaram
E. Maslowska
Aravind Sankar
15
4
0
24 Jan 2023
Cross-lingual German Biomedical Information Extraction: from Zero-shot to Human-in-the-Loop
Siting Liang
Mareike Hartmann
Daniel Sonntag
23
3
0
24 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
23
3
0
24 Jan 2023
Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining
Karol Nowakowski
M. Ptaszynski
Kyoko Murasaki
Jagna Nieuwazny
23
24
0
18 Jan 2023
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
35
5
0
06 Jan 2023
CiT: Curation in Training for Effective Vision-Language Data
Hu Xu
Saining Xie
Po-Yao (Bernie) Huang
Licheng Yu
Russ Howes
Gargi Ghosh
Luke Zettlemoyer
Christoph Feichtenhofer
VLM
DiffM
33
25
0
05 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
99
35
0
01 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
Vassilis Katsouros
Alexandros Potamianos
VLM
30
7
0
31 Dec 2022
Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents
Sayar Ghosh Roy
Anshul Padhi
Risubh Jain
Manish Gupta
Vasudeva Varma
AI4TS
30
2
0
31 Dec 2022
Continual Contrastive Finetuning Improves Low-Resource Relation Extraction
Wenxuan Zhou
Sheng Zhang
Tristan Naumann
Muhao Chen
Hoifung Poon
54
6
0
21 Dec 2022
Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis
Josip Jukić
Jan Snajder
16
5
0
20 Dec 2022
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Hongjin Su
Weijia Shi
Jungo Kasai
Yizhong Wang
Yushi Hu
Mari Ostendorf
Wen-tau Yih
Noah A. Smith
Luke Zettlemoyer
Tao Yu
27
282
0
19 Dec 2022
APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning
Soumya Sanyal
Yichong Xu
Shuohang Wang
Ziyi Yang
Reid Pryzant
Wenhao Yu
Chenguang Zhu
Xiang Ren
ReLM
LRM
35
8
0
19 Dec 2022
DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation
Yuxi Feng
Xiaoyuan Yi
Xiting Wang
L. Lakshmanan
Xing Xie
DiffM
35
5
0
16 Dec 2022
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Kyuyong Shin
Hanock Kwak
Wonjae Kim
Jisu Jeong
Seungjae Jung
KyungHyun Kim
Jung-Woo Ha
Sang-Woo Lee
27
4
0
07 Dec 2022
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks
Zhongwei Wan
Yichun Yin
Wei Zhang
Jiaxin Shi
Lifeng Shang
Guangyong Chen
Xin Jiang
Qun Liu
VLM
CLL
36
16
0
07 Dec 2022
CySecBERT: A Domain-Adapted Language Model for the Cybersecurity Domain
Markus Bayer
Philip D. Kuehn
Ramin Shanehsaz
Christian A. Reuter
15
43
0
06 Dec 2022