2004.10964
Cited By
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 528 papers shown
Title
Crude Oil-related Events Extraction and Processing: A Transfer Learning Approach
Meisin Lee
Lay-Ki Soon
Eu-Gene Siew
32
0
0
01 May 2022
Detoxifying Language Models with a Toxic Corpus
Yoon A Park
Frank Rudzicz
27
6
0
30 Apr 2022
On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model
Seongjin Shin
Sang-Woo Lee
Hwijeen Ahn
Sungdong Kim
Hyoungseok Kim
...
Kyunghyun Cho
Gichang Lee
W. Park
Jung-Woo Ha
Nako Sung
LRM
38
94
0
28 Apr 2022
SkillSpan: Hard and Soft Skill Extraction from English Job Postings
Mike Zhang
Kristian Nørgaard Jensen
Sif Dam Sonniks
Barbara Plank
28
53
0
27 Apr 2022
A Thorough Examination on Zero-shot Dense Retrieval
Ruiyang Ren
Yingqi Qu
Qingbin Liu
Wayne Xin Zhao
Qifei Wu
Yuchen Ding
Hua Wu
Haifeng Wang
Ji-Rong Wen
39
41
0
27 Apr 2022
Modular Domain Adaptation
Junshen K. Chen
Dallas Card
Dan Jurafsky
17
1
0
26 Apr 2022
KALA: Knowledge-Augmented Language Model Adaptation
Minki Kang
Jinheon Baek
Sung Ju Hwang
VLM
KELM
36
34
0
22 Apr 2022
Decorate the Examples: A Simple Method of Prompt Design for Biomedical Relation Extraction
Hui-Syuan Yeh
Thomas Lavergne
Pierre Zweigenbaum
26
10
0
21 Apr 2022
A Corpus for Understanding and Generating Moral Stories
Jian Guan
Ziqi Liu
Minlie Huang
32
10
0
20 Apr 2022
Synthetic Target Domain Supervision for Open Retrieval QA
Revanth Reddy Gangi Reddy
Bhavani Iyer
Md Arafat Sultan
Rong Zhang
Avirup Sil
Vittorio Castelli
Radu Florian
Salim Roukos
OOD
36
12
0
20 Apr 2022
Zero-shot Entity and Tweet Characterization with Designed Conditional Prompts and Contexts
S. Srivatsa
Tushar Mohan
Kumari Neha
Nishchay Malakar
Ponnurangam Kumaraguru
Srinath Srinivasa
36
0
0
18 Apr 2022
Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base
Cunxiang Wang
Fuli Luo
Yanyang Li
Runxin Xu
Fei Huang
Yue Zhang
KELM
39
2
0
17 Apr 2022
Improving Passage Retrieval with Zero-Shot Question Generation
Devendra Singh Sachan
M. Lewis
Mandar Joshi
Armen Aghajanyan
Wen-tau Yih
J. Pineau
Luke Zettlemoyer
OOD
LRM
38
157
0
15 Apr 2022
Revisiting Transformer-based Models for Long Document Classification
Xiang Dai
Ilias Chalkidis
S. Darkner
Desmond Elliott
VLM
25
68
0
14 Apr 2022
Distributionally Robust Models with Parametric Likelihood Ratios
Paul Michel
Tatsunori Hashimoto
Graham Neubig
OOD
30
15
0
13 Apr 2022
BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model
Hongyi Yuan
Zheng Yuan
Ruyi Gan
Jiaxing Zhang
Yutao Xie
Sheng Yu
LM&MA
35
123
0
08 Apr 2022
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores
Wei-Cheng Tseng
Wei-Tsung Kao
Hung-yi Lee
19
21
0
07 Apr 2022
Fusing finetuned models for better pretraining
Leshem Choshen
Elad Venezian
Noam Slonim
Yoav Katz
FedML
AI4CE
MoMe
56
87
0
06 Apr 2022
The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems
Caleb Ziems
Jane A. Yu
Yi-Chia Wang
A. Halevy
Diyi Yang
33
92
0
06 Apr 2022
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations
L. D. Prasad
Sreyan Ghosh
S. Umesh
30
12
0
31 Mar 2022
Neural Pipeline for Zero-Shot Data-to-Text Generation
Zdeněk Kasner
Ondrej Dusek
18
33
0
30 Mar 2022
Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment
Mu Yang
K. Hirschi
S. Looney
Okim Kang
John H. L. Hansen
43
15
0
29 Mar 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
42
100
0
24 Mar 2022
Mix and Match: Learning-free Controllable Text Generation using Energy Language Models
Fatemehsadat Mireshghallah
Kartik Goyal
Taylor Berg-Kirkpatrick
41
78
0
24 Mar 2022
Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal
Umang Gupta
Jwala Dhamala
Varun Kumar
Apurv Verma
Yada Pruksachatkun
Satyapriya Krishna
Rahul Gupta
Kai-Wei Chang
Greg Ver Steeg
Aram Galstyan
21
49
0
23 Mar 2022
A Scalable Model Specialization Framework for Training and Inference using Submodels and its Application to Speech Model Personalization
Fadi Biadsy
Youzheng Chen
Xia Zhang
Oleg Rybakov
Andrew Rosenberg
Pedro J. Moreno
51
13
0
23 Mar 2022
From Stance to Concern: Adaptation of Propositional Analysis to New Tasks and Domains
Brodie Mather
Bonnie J. Dorr
Adam Dalton
William de Beaumont
Owen Rambow
Sonja M. Schmer-Galunder
35
8
0
20 Mar 2022
Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation
Xinyi Wang
Sebastian Ruder
Graham Neubig
42
61
0
17 Mar 2022
Geographic Adaptation of Pretrained Language Models
Valentin Hofmann
Goran Glavaš
Nikola Ljubešić
J. Pierrehumbert
Hinrich Schütze
VLM
26
16
0
16 Mar 2022
Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation
Wenxuan Wang
Wenxiang Jiao
Yongchang Hao
Xing Wang
Shuming Shi
Zhaopeng Tu
Michael Lyu
AIMat
39
26
0
16 Mar 2022
Representation Learning for Resource-Constrained Keyphrase Generation
Di Wu
Wasi Uddin Ahmad
Sunipa Dev
Kai-Wei Chang
43
17
0
15 Mar 2022
ELLE: Efficient Lifelong Pre-training for Emerging Data
Yujia Qin
Jiajie Zhang
Yankai Lin
Zhiyuan Liu
Peng Li
Maosong Sun
Jie Zhou
30
67
0
12 Mar 2022
LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval
Canwen Xu
Daya Guo
Nan Duan
Julian McAuley
RALM
VLM
29
46
0
11 Mar 2022
Adaptor: Objective-Centric Adaptation Framework for Language Models
Michal Štefánik
Vít Novotný
Nikola Groverová
Petr Sojka
35
10
0
08 Mar 2022
Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots
Wenting Zhao
Ye Liu
Yao Wan
Philip S. Yu
30
11
0
01 Mar 2022
Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation
Xiang Hu
Haitao Mi
Liang Li
Gerard de Melo
34
13
0
01 Mar 2022
Reward Modeling for Mitigating Toxicity in Transformer-based Language Models
Farshid Faal
K. Schmitt
Jia Yuan Yu
13
24
0
19 Feb 2022
Automated Attack Synthesis by Extracting Finite State Machines from Protocol Specification Documents
Maria Leonor Pacheco
Max von Hippel
Ben Weintraub
Dan Goldwasser
Cristina Nita-Rotaru
27
30
0
18 Feb 2022
Semantic-Oriented Unlabeled Priming for Large-Scale Language Models
Yanchen Liu
Timo Schick
Hinrich Schütze
VLM
36
15
0
12 Feb 2022
AdaPrompt: Adaptive Model Training for Prompt-based NLP
Yulong Chen
Yang Liu
Li Dong
Shuohang Wang
Chenguang Zhu
Michael Zeng
Yue Zhang
VLM
27
45
0
10 Feb 2022
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
Wei Ping
Ming-Yu Liu
Chaowei Xiao
P. Xu
M. Patwary
M. Shoeybi
Bo-wen Li
Anima Anandkumar
Bryan Catanzaro
31
65
0
08 Feb 2022
Can Wikipedia Help Offline Reinforcement Learning?
Machel Reid
Yutaro Yamada
S. Gu
3DV
RALM
OffRL
140
95
0
28 Jan 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith
M. Patwary
Brandon Norick
P. LeGresley
Samyam Rajbhandari
...
M. Shoeybi
Yuxiong He
Michael Houston
Saurabh Tiwary
Bryan Catanzaro
MoE
95
733
0
28 Jan 2022
NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis
Shamsuddeen Hassan Muhammad
David Ifeoluwa Adelani
Sebastian Ruder
I. Ahmad
Idris Abdulmumin
...
Chris C. Emezue
Saheed Abdul
Anuoluwapo Aremu
Alipio Jeorge
P. Brazdil
45
96
0
20 Jan 2022
TourBERT: A pretrained language model for the tourism industry
Veronika Arefieva
R. Egger
14
4
0
19 Jan 2022
Improving Neural Machine Translation by Denoising Training
Liang Ding
Keqin Peng
Dacheng Tao
VLM
AI4CE
48
6
0
19 Jan 2022
Transferability in Deep Learning: A Survey
Junguang Jiang
Yang Shu
Jianmin Wang
Mingsheng Long
OOD
34
101
0
15 Jan 2022
Assemble Foundation Models for Automatic Code Summarization
Jian Gu
P. Salza
H. Gall
36
35
0
13 Jan 2022
Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis
L. Ein-Dor
Ilya Shnayderman
Artem Spector
Lena Dankin
R. Aharonov
Noam Slonim
36
8
0
06 Jan 2022
Neural Architectures for Biological Inter-Sentence Relation Extraction
Enrique Noriega-Atala
Peter Lovett
Clayton T. Morrison
Mihai Surdeanu
NAI
33
3
0
17 Dec 2021