Don't Stop Pretraining: Adapt Language Models to Domains and Tasks


23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
    VLM
    AI4CE
    CLL

Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"

50 / 522 papers shown
Semantic-based Pre-training for Dialogue Understanding
Xuefeng Bai
Linfeng Song
Yue Zhang
40
7
0
19 Sep 2022
Classical Sequence Match is a Competitive Few-Shot One-Class Learner
Mengting Hu
H. Gao
Yinhao Bai
Mingming Liu
8
0
0
14 Sep 2022
CSL: A Large-scale Chinese Scientific Literature Dataset
Yudong Li
Yuqing Zhang
Zhe Zhao
Lin-cheng Shen
Weijie Liu
Weiquan Mao
Hui Zhang
AILaw
135
50
0
12 Sep 2022
Trigger Warnings: Bootstrapping a Violence Detector for FanFiction
Magdalena Wolska
Christopher Schröder
Ole Borchardt
Benno Stein
Martin Potthast
20
9
0
09 Sep 2022
T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition
Asahi Ushio
Jose Camacho-Collados
20
73
0
09 Sep 2022
External Knowledge Selection with Weighted Negative Sampling in Knowledge-grounded Task-oriented Dialogue Systems
Janghoon Han
Joongbo Shin
Hosung Song
Hyunjik Jo
Gyeonghun Kim
Yireun Kim
Stanley Jungkyu Choi
21
4
0
06 Sep 2022
Find the Funding: Entity Linking with Incomplete Funding Knowledge Bases
G. Aydin
S. A. Tabatabaei
Giorgios Tsatsaronis
Faegheh Hasibi
21
2
0
01 Sep 2022
Review of Natural Language Processing in Pharmacology
D. Trajanov
Vangel Trajkovski
Makedonka Dimitrieva
Jovana Dobreva
Milos Jovanovik
Matej Klemen
Aleš Žagar
Marko Robnik-Šikonja
LM&MA
36
7
0
22 Aug 2022
PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
VLM
CLL
34
41
0
22 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
34
0
0
18 Aug 2022
Summarizing Patients Problems from Hospital Progress Notes Using Pre-trained Sequence-to-Sequence Models
Yanjun Gao
Dmitriy Dligach
T. Miller
Dongfang Xu
M. Churpek
Majid Afshar
AI4MH
26
36
0
17 Aug 2022
Visual Comparison of Language Model Adaptation
Rita Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
49
16
0
17 Aug 2022
Transformer Encoder for Social Science
Haosen Ge
In Young Park
Xuancheng Qian
Grace Zeng
31
0
0
17 Aug 2022
A Multimodal Transformer: Fusing Clinical Notes with Structured EHR Data for Interpretable In-Hospital Mortality Prediction
Weimin Lyu
Xinyu Dong
Rachel Wong
Songzhu Zheng
Kayley Abell-Hart
Fusheng Wang
Chao Chen
26
47
0
09 Aug 2022
Abstractive Meeting Summarization: A Survey
Virgile Rennard
Guokan Shang
Julie Hunter
Michalis Vazirgiannis
40
15
0
08 Aug 2022
On the Limitations of Sociodemographic Adaptation with Transformers
Chia-Chien Hung
Anne Lauscher
Dirk Hovy
Simone Paolo Ponzetto
Goran Glavaš
32
0
0
01 Aug 2022
Few-shot Adaptation Works with UnpredicTable Data
Jun Shern Chan
Michael Pieler
Jonathan Jao
Jérémy Scheurer
Ethan Perez
36
5
0
01 Aug 2022
Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation
Alexander H. Liu
Samuel J. Yang
32
5
0
30 Jul 2022
ELF22: A Context-based Counter Trolling Dataset to Combat Internet Trolls
Huije Lee
Young Ju Na
Hoyun Song
Jisu Shin
Jong C. Park
26
7
0
30 Jul 2022
"Do you follow me?": A Survey of Recent Approaches in Dialogue State
  Tracking
Léo Jacqmin
L. Rojas-Barahona
Benoit Favre
43
27
0
29 Jul 2022
Innovations in Neural Data-to-text Generation: A Survey
Mandar Sharma
Ajay K. Gogineni
Naren Ramakrishnan
34
10
0
25 Jul 2022
PLM-ICD: Automatic ICD Coding with Pretrained Language Models
Chao-Wei Huang
Shang-Chi Tsai
Yun-Nung Chen
40
49
0
12 Jul 2022
Domain Confused Contrastive Learning for Unsupervised Domain Adaptation
Quanyu Long
Tianze Luo
Wenya Wang
Sinno Jialin Pan
59
8
0
10 Jul 2022
Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation
Zejiang Hou
Julian Salazar
George Polovets
30
14
0
07 Jul 2022
Improving Low-Resource Speech Recognition with Pretrained Speech Models: Continued Pretraining vs. Semi-Supervised Training
Mitchell DeHaven
J. Billa
VLM
AI4TS
17
8
0
01 Jul 2022
MVP: Multi-task Supervised Pre-training for Natural Language Generation
Tianyi Tang
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
49
24
0
24 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding
Wayne Xin Zhao
Kun Zhou
Zheng Gong
Beichen Zhang
Yuanhang Zhou
Jing Sha
Zhigang Chen
Shijin Wang
Cong Liu
Ji-Rong Wen
39
19
0
13 Jun 2022
Sort by Structure: Language Model Ranking as Dependency Probing
Max Müller-Eberstein
Rob van der Goot
Barbara Plank
41
3
0
10 Jun 2022
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future
Jan-Christoph Klie
Bonnie Webber
Iryna Gurevych
42
43
0
05 Jun 2022
Task-Adaptive Pre-Training for Boosting Learning With Noisy Labels: A Study on Text Classification for African Languages
D. Zhu
Michael A. Hedderich
Fangzhou Zhai
David Ifeoluwa Adelani
Dietrich Klakow
NoLa
34
0
0
03 Jun 2022
UPB at SemEval-2022 Task 5: Enhancing UNITER with Image Sentiment and Graph Convolutional Networks for Multimedia Automatic Misogyny Identification
Andrei Paraschiv
M. Dascalu
Dumitru-Clementin Cercel
27
3
0
29 May 2022
AANG: Automating Auxiliary Learning
Lucio Dery
Paul Michel
M. Khodak
Graham Neubig
Ameet Talwalkar
41
9
0
27 May 2022
kNN-Prompt: Nearest Neighbor Zero-Shot Inference
Weijia Shi
Julian Michael
Suchin Gururangan
Luke Zettlemoyer
RALM
VLM
29
32
0
27 May 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
68
206
0
26 May 2022
The Document Vectors Using Cosine Similarity Revisited
Bingyu Zhang
N. Arefyev
27
9
0
26 May 2022
Detecting Label Errors by using Pre-Trained Language Models
Derek Chong
Jenny Hong
Christopher D. Manning
NoLa
55
21
0
25 May 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
27
18
0
25 May 2022
ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data
Xiaochuang Han
Yulia Tsvetkov
24
28
0
25 May 2022
Gradient-Based Constrained Sampling from Language Models
Sachin Kumar
Biswajit Paria
Yulia Tsvetkov
BDL
39
53
0
25 May 2022
Can Foundation Models Wrangle Your Data?
A. Narayan
Ines Chami
Laurel J. Orr
Simran Arora
Christopher Ré
LMTD
AI4CE
181
214
0
20 May 2022
When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning
Orion Weller
Kevin Seppi
Matt Gardner
22
21
0
17 May 2022
Improving Contextual Representation with Gloss Regularized Pre-training
Yu Lin
Zhecheng An
Peihao Wu
Zejun Ma
27
5
0
13 May 2022
Clinical Prompt Learning with Frozen Language Models
Niall Taylor
Yi Zhang
Dan W Joyce
A. Nevado-Holgado
Andrey Kormilitzin
VLM
LM&MA
16
31
0
11 May 2022
Few-shot Mining of Naturally Occurring Inputs and Outputs
Mandar Joshi
Terra Blevins
M. Lewis
Daniel S. Weld
Luke Zettlemoyer
33
1
0
09 May 2022
Improving negation detection with negation-focused pre-training
Thinh Hung Truong
Timothy Baldwin
Trevor Cohn
Karin Verspoor
32
21
0
09 May 2022
A Dataset for N-ary Relation Extraction of Drug Combinations
Aryeh Tiktinsky
Vijay Viswanathan
Danna Niezni
D. Azagury
Y. Shamay
Hillel Taub-Tabib
Tom Hope
Yoav Goldberg
37
18
0
04 May 2022
Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding
Yanfeng Chang
Yun-Nung Chen
30
9
0
02 May 2022
POLITICS: Pretraining with Same-story Article Comparison for Ideology Prediction and Stance Detection
Yujian Liu
Xinliang Frederick Zhang
David Wegsman
Nick Beauchamp
Lu Wang
40
71
0
02 May 2022
Crude Oil-related Events Extraction and Processing: A Transfer Learning Approach
Meisin Lee
Lay-Ki Soon
Eu-Gene Siew
29
0
0
01 May 2022