ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2004.10964
  4. Cited By
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
    VLM
    AI4CE
    CLL
ArXivPDFHTML

Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"

50 / 522 papers shown
Title
Towards understanding evolution of science through language model series
Towards understanding evolution of science through language model series
Junjie Dong
Zhuoqi Lyu
Qing Ke
AI4TS
37
0
0
15 Sep 2024
DomURLs_BERT: Pre-trained BERT-based Model for Malicious Domains and
  URLs Detection and Classification
DomURLs_BERT: Pre-trained BERT-based Model for Malicious Domains and URLs Detection and Classification
Abdelkader El Mahdaouy
Salima Lamsiyah
Meryem Janati Idrissi
H. Alami
Zakaria Yartaoui
Ismail Berrada
21
3
0
13 Sep 2024
Self-Masking Networks for Unsupervised Adaptation
Self-Masking Networks for Unsupervised Adaptation
Alfonso Taboada Warmerdam
Mathilde Caron
Yuki M. Asano
54
1
0
11 Sep 2024
LUK: Empowering Log Understanding with Expert Knowledge from Large Language Models
LUK: Empowering Log Understanding with Expert Knowledge from Large Language Models
Lipeng Ma
Weidong Yang
Sihang Jiang
Ben Fei
Mingjie Zhou
Shuhao Li
Bo Xu
Bo Xu
Yanghua Xiao
66
0
0
03 Sep 2024
From Prediction to Application: Language Model-based Code Knowledge
  Tracing with Domain Adaptive Pre-Training and Automatic Feedback System with
  Pedagogical Prompting for Comprehensive Programming Education
From Prediction to Application: Language Model-based Code Knowledge Tracing with Domain Adaptive Pre-Training and Automatic Feedback System with Pedagogical Prompting for Comprehensive Programming Education
Unggi Lee
Jiyeong Bae
Yeonji Jung
Minji Kang
Gyuri Byun
...
Sookbun Lee
Jaekwon Park
Taekyung Ahn
Gunho Lee
Hyeoncheol Kim
AI4Ed
KELM
39
1
0
31 Aug 2024
Diffusion Guided Language Modeling
Diffusion Guided Language Modeling
Justin Lovelace
Varsha Kishore
Yiwei Chen
Kilian Q. Weinberger
44
6
0
08 Aug 2024
Automated Review Generation Method Based on Large Language Models
Automated Review Generation Method Based on Large Language Models
Shican Wu
Xiao Ma
Dehui Luo
Lulu Li
Xiangcheng Shi
...
Ran Luo
Chunlei Pei
Zhijian Zhao
Zhi-Jian Zhao
Jinlong Gong
77
0
0
30 Jul 2024
Towards Aligning Language Models with Textual Feedback
Towards Aligning Language Models with Textual Feedback
Sauc Abadal Lloret
S. Dhuliawala
K. Murugesan
Mrinmaya Sachan
VLM
50
1
0
24 Jul 2024
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates
Zeyu Leo Liu
Shrey Pandit
Xi Ye
Eunsol Choi
Greg Durrett
KELM
ALM
81
4
0
08 Jul 2024
BadCLM: Backdoor Attack in Clinical Language Models for Electronic
  Health Records
BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records
Weimin Lyu
Zexin Bi
Fusheng Wang
Chao Chen
50
5
0
06 Jul 2024
Using LLMs to label medical papers according to the CIViC evidence model
Using LLMs to label medical papers according to the CIViC evidence model
Markus Hisch
Xing David Wang
47
0
0
05 Jul 2024
CHEW: A Dataset of CHanging Events in Wikipedia
CHEW: A Dataset of CHanging Events in Wikipedia
Hsuvas Borkakoty
Luis Espinosa-Anke
48
1
0
27 Jun 2024
MPCODER: Multi-user Personalized Code Generator with Explicit and
  Implicit Style Representation Learning
MPCODER: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning
Zhenlong Dai
Chang Yao
Wenkang Han
Ying Yuan
Zhipeng Gao
Jingyuan Chen
26
11
0
25 Jun 2024
A Survey on Large Language Models from General Purpose to Medical
  Applications: Datasets, Methodologies, and Evaluations
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MA
AI4MH
ELM
49
5
0
14 Jun 2024
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
Chen Cecilia Liu
Iryna Gurevych
Anna Korhonen
38
5
0
06 Jun 2024
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
Yuchen Zhuang
Haotian Sun
Yue Yu
Rushi Qiang
Qifan Wang
Chao Zhang
Bo Dai
AAML
53
16
0
05 Jun 2024
Entangled Relations: Leveraging NLI and Meta-analysis to Enhance Biomedical Relation Extraction
Entangled Relations: Leveraging NLI and Meta-analysis to Enhance Biomedical Relation Extraction
William Hogan
Jingbo Shang
18
0
0
31 May 2024
ESG-FTSE: A corpus of news articles with ESG relevance labels and use
  cases
ESG-FTSE: A corpus of news articles with ESG relevance labels and use cases
Mariya Pavlova
Bernard Casey
Miaosen Wang
22
0
0
30 May 2024
Aligning to Thousands of Preferences via System Message Generalization
Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee
Sue Hyun Park
Seungone Kim
Minjoon Seo
ALM
44
38
0
28 May 2024
Scaling Laws for Discriminative Classification in Large Language Models
Scaling Laws for Discriminative Classification in Large Language Models
Dean Wyatte
Fatemeh Tahmasbi
Ming Li
Thomas Markovich
52
2
0
24 May 2024
BMRetriever: Tuning Large Language Models as Better Biomedical Text
  Retrievers
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers
Ran Xu
Wenqi Shi
Yue Yu
Yuchen Zhuang
Yanqiao Zhu
M. D. Wang
Joyce C. Ho
Chao Zhang
Carl Yang
LM&MA
40
19
0
29 Apr 2024
Effective Unsupervised Constrained Text Generation based on Perturbed
  Masking
Effective Unsupervised Constrained Text Generation based on Perturbed Masking
Yingwen Fu
Wenjie Ou
Zhou Yu
Yue Lin
28
1
0
24 Apr 2024
No Train but Gain: Language Arithmetic for training-free Language
  Adapters enhancement
No Train but Gain: Language Arithmetic for training-free Language Adapters enhancement
Mateusz Klimaszewski
Piotr Andruszkiewicz
Alexandra Birch
MoMe
47
4
0
24 Apr 2024
Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions
Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions
Taojun Hu
Xiao-Hua Zhou
ELM
41
13
0
14 Apr 2024
Comprehensive Study on German Language Models for Clinical and
  Biomedical Text Understanding
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding
Ahmad Idrissi-Yaghir
Amin Dada
Henning Schafer
Kamyar Arzideh
Giulia Baldini
...
Peter A. Horn
Christin Seifert
F. Nensa
Jens Kleesiek
Christoph M. Friedrich
AI4MH
39
2
0
08 Apr 2024
Automating Research Synthesis with Domain-Specific Large Language Model
  Fine-Tuning
Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning
Teo Susnjak
Peter Hwang
N. Reyes
A. Barczak
Timothy R. McIntosh
Surangika Ranathunga
70
23
0
08 Apr 2024
Your Finetuned Large Language Model is Already a Powerful Out-of-distribution Detector
Your Finetuned Large Language Model is Already a Powerful Out-of-distribution Detector
Andi Zhang
Tim Z. Xiao
Weiyang Liu
Robert Bamler
Damon J. Wischik
OODD
51
4
0
07 Apr 2024
Can Humans Identify Domains?
Can Humans Identify Domains?
Maria Barrett
Max Müller-Eberstein
Elisa Bassignana
Amalie Brogaard Pauli
Mike Zhang
Rob van der Goot
47
1
0
02 Apr 2024
From Robustness to Improved Generalization and Calibration in
  Pre-trained Language Models
From Robustness to Improved Generalization and Calibration in Pre-trained Language Models
Josip Jukić
Jan Snajder
45
0
0
31 Mar 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
59
64
0
25 Mar 2024
Simple and Scalable Strategies to Continually Pre-train Large Language
  Models
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Adam Ibrahim
Benjamin Thérien
Kshitij Gupta
Mats L. Richter
Quentin Anthony
Timothée Lesort
Eugene Belilovsky
Irina Rish
KELM
CLL
44
54
0
13 Mar 2024
From One to Many: Expanding the Scope of Toxicity Mitigation in Language
  Models
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models
Luiza Amador Pozzobon
Patrick Lewis
Sara Hooker
Beyza Ermis
45
7
0
06 Mar 2024
SaulLM-7B: A pioneering Large Language Model for Law
SaulLM-7B: A pioneering Large Language Model for Law
Pierre Colombo
T. Pires
Malik Boudiaf
Dominic Culver
Rui Melo
...
Andre F. T. Martins
Fabrizio Esposito
Vera Lúcia Raposo
Sofia Morgado
Michael Desa
ELM
AILaw
52
66
0
06 Mar 2024
A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry
A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry
Handel Moshe
Ashish Vaswani
Niki Parmar
Łukasz Gomez
Kaiser Illia
48
1
0
27 Feb 2024
Investigating Continual Pretraining in Large Language Models: Insights and Implications
Investigating Continual Pretraining in Large Language Models: Insights and Implications
cCaugatay Yildiz
Nishaanth Kanna Ravichandran
Prishruit Punia
Matthias Bethge
Beyza Ermis
CLL
KELM
LRM
60
25
0
27 Feb 2024
How Important is Domain Specificity in Language Models and Instruction
  Finetuning for Biomedical Relation Extraction?
How Important is Domain Specificity in Language Models and Instruction Finetuning for Biomedical Relation Extraction?
Aviv Brokman
Ramakanth Kavuluru
LM&MA
ALM
34
3
0
21 Feb 2024
MORE-3S:Multimodal-based Offline Reinforcement Learning with Shared
  Semantic Spaces
MORE-3S:Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces
Tianyu Zheng
Ge Zhang
Xingwei Qu
Ming Kuang
Stephen W. Huang
Zhaofeng He
OffRL
58
1
0
20 Feb 2024
LEIA: Facilitating Cross-lingual Knowledge Transfer in Language Models
  with Entity-based Data Augmentation
LEIA: Facilitating Cross-lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
Ikuya Yamada
Ryokan Ri
KELM
25
0
0
18 Feb 2024
Deep Learning-based Computational Job Market Analysis: A Survey on Skill
  Extraction and Classification from Job Postings
Deep Learning-based Computational Job Market Analysis: A Survey on Skill Extraction and Classification from Job Postings
Elena Senger
Mike Zhang
Rob van der Goot
Barbara Plank
34
7
0
08 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in
  Closed-Source LLMs
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
35
159
0
06 Feb 2024
How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
Rheeya Uppaal
Yixuan Li
Junjie Hu
40
4
0
31 Jan 2024
Named Entity Recognition Under Domain Shift via Metric Learning for Life
  Sciences
Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences
Hongyi Liu
Qingyun Wang
Payam Karisani
Heng Ji
21
1
0
19 Jan 2024
Some things are more CRINGE than others: Iterative Preference
  Optimization with the Pairwise Cringe Loss
Some things are more CRINGE than others: Iterative Preference Optimization with the Pairwise Cringe Loss
Jing Xu
Andrew Lee
Sainbayar Sukhbaatar
Jason Weston
23
86
0
27 Dec 2023
Balancing the Style-Content Trade-Off in Sentiment Transfer Using
  Polarity-Aware Denoising
Balancing the Style-Content Trade-Off in Sentiment Transfer Using Polarity-Aware Denoising
Sourabrata Mukherjee
Zdeněk Kasner
Ondrej Dusek
DiffM
16
11
0
22 Dec 2023
Time is Encoded in the Weights of Finetuned Language Models
Time is Encoded in the Weights of Finetuned Language Models
Kai Nylund
Suchin Gururangan
Noah A. Smith
AI4TS
36
18
0
20 Dec 2023
Mutual Enhancement of Large and Small Language Models with Cross-Silo
  Knowledge Transfer
Mutual Enhancement of Large and Small Language Models with Cross-Silo Knowledge Transfer
Yongheng Deng
Ziqing Qiao
Ju Ren
Yang Liu
Yaoxue Zhang
30
11
0
10 Dec 2023
DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction
DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction
Yanlong Li
Chamara Madarasingha
Kanchana Thilakarathna
26
1
0
06 Dec 2023
Leveraging Domain Adaptation and Data Augmentation to Improve Quránic
  IR in English and Arabic
Leveraging Domain Adaptation and Data Augmentation to Improve Quránic IR in English and Arabic
Vera Pavlova
23
2
0
05 Dec 2023
LowResource at BLP-2023 Task 2: Leveraging BanglaBert for Low Resource
  Sentiment Analysis of Bangla Language
LowResource at BLP-2023 Task 2: Leveraging BanglaBert for Low Resource Sentiment Analysis of Bangla Language
Aunabil Chakma
Masum Hasan
47
3
0
21 Nov 2023
On the Potential and Limitations of Few-Shot In-Context Learning to
  Generate Metamorphic Specifications for Tax Preparation Software
On the Potential and Limitations of Few-Shot In-Context Learning to Generate Metamorphic Specifications for Tax Preparation Software
Dananjay Srinivas
Rohan Das
Saeid Tizpaz-Niari
Ashutosh Trivedi
Maria Leonor Pacheco
33
4
0
20 Nov 2023
Previous
12345...91011
Next