Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2004.10964
Cited By
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 475 papers shown
Title
Telco-oRAG: Optimizing Retrieval-augmented Generation for Telecom Queries via Hybrid Retrieval and Neural Routing
Andrei-Laurentiu Bornea
Fadhel Ayed
Antonio De Domenico
Nicola Piovesan
Tareq Si Salem
Ali Maatouk
7
0
0
17 May 2025
Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning
Shaurya Sharthak
Vinayak Pahalwan
Adithya Kamath
Adarsh Shirawalmath
CLL
VLM
45
0
0
14 May 2025
Large Language Models Meet Stance Detection: A Survey of Tasks, Methods, Applications, Challenges and Future Directions
Lata Pangtey
Anukriti Bhatnagar
Shubhi Bansal
Shahid Shafi Dar
Nagendra Kumar
34
0
0
13 May 2025
Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation
Hannes Waldetoft
Jakob Torgander
Måns Magnusson
29
0
0
05 May 2025
Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
Paul Kassianik
Baturay Saglam
Alexander Chen
Blaine Nelson
Anu Vellore
...
Hyrum Anderson
Kojin Oshiba
Omar Santos
Yaron Singer
Amin Karbasi
PILM
63
0
0
28 Apr 2025
Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language
Anastasia Zhukova
Christian E. Matt
Terry Ruas
Bela Gipp
CLL
VLM
98
0
0
28 Apr 2025
TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation
Gwen Yidou Weng
Benjie Wang
Mathias Niepert
BDL
158
0
0
25 Apr 2025
Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
Wataru Kawakami
Keita Suzuki
Junichiro Iwasawa
LRM
75
0
0
25 Apr 2025
CSPLADE: Learned Sparse Retrieval with Causal Language Models
Zhichao Xu
Aosong Feng
Yijun Tian
Haibo Ding
Lin Leee Cheong
RALM
47
0
0
15 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
61
0
0
02 Apr 2025
OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery
Vignesh Prabhakar
Md Amirul Islam
Adam Atanas
Yansen Wang
J. N. Han
...
Rucha Apte
Robert Clark
Kang Xu
Zihan Wang
Kai Liu
LRM
88
1
0
22 Mar 2025
A Dataset for Analysing News Framing in Chinese Media
Owen Cook
Yida Mu
Xinye Yang
Xingyi Song
Kalina Bontcheva
72
1
0
06 Mar 2025
CareerBERT: Matching Resumes to ESCO Jobs in a Shared Embedding Space for Generic Job Recommendations
Julian Rosenberger
Lukas Wolfrum
Sven Weinzierl
Mathias Kraus
Patrick Zschech
60
0
0
03 Mar 2025
Personalize Your LLM: Fake it then Align it
Yijing Zhang
Dyah Adila
Changho Shin
Frederic Sala
88
0
0
02 Mar 2025
NaijaNLP: A Survey of Nigerian Low-Resource Languages
Isa Inuwa-Dutse
44
0
0
27 Feb 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
61
2
0
21 Feb 2025
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Yuchen Wen
Keping Bi
Wei Chen
J. Guo
Xueqi Cheng
89
1
0
20 Feb 2025
Why does my medical AI look at pictures of birds? Exploring the efficacy of transfer learning across domain boundaries
F. Jonske
M. Kim
Enrico Nasca
J. Evers
Johannes Haubold
...
F. Nensa
Michael Kamp
C. Seibold
Jan Egger
Jens Kleesiek
79
1
0
17 Feb 2025
FinMTEB: Finance Massive Text Embedding Benchmark
Yixuan Tang
Yi Yang
AIFin
66
0
0
16 Feb 2025
Assessing the Impact of the Quality of Textual Data on Feature Representation and Machine Learning Models
Tabinda Sarwar
Antonio Jose Jimeno Yepes
Lawrence Cavedon
69
0
0
12 Feb 2025
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset
Naome A. Etori
Maria Gini
81
2
0
10 Feb 2025
Privacy-Preserving Dataset Combination
Keren Fuentes
Mimee Xu
Irene Chen
43
0
0
09 Feb 2025
Detecting harassment and defamation in cyberbullying with emotion-adaptive training
Peiling Yi
A. Zubiaga
Yunfei Long
87
0
0
28 Jan 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Min Zhang
LM&MA
AILaw
93
154
0
28 Jan 2025
Speech Translation Refinement using Large Language Models
Huaixia Dou
Xinyu Tian
Xinglin Lyu
Jie Zhu
Junhui Li
Lifan Guo
176
0
0
28 Jan 2025
Addressing Bias in Generative AI: Challenges and Research Opportunities in Information Management
Xiahua Wei
Naveen Kumar
Han Zhang
68
5
0
22 Jan 2025
News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation
Andreea Iana
Fabian David Schmidt
Goran Glavas
Heiko Paulheim
71
3
0
20 Jan 2025
On Adversarial Robustness of Language Models in Transfer Learning
Bohdan Turbal
Anastasiia Mazur
Jiaxu Zhao
Mykola Pechenizkiy
AAML
45
0
0
03 Jan 2025
The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better
Scott Geng
Cheng-Yu Hsieh
Vivek Ramanujan
Matthew Wallingford
Chun-Liang Li
Pang Wei Koh
Ranjay Krishna
DiffM
68
6
0
03 Jan 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao
Shiqian Su
X. Zhu
Chenyu Zhang
Zhe Chen
...
Wenhai Wang
Lewei Lu
Gao Huang
Yu Qiao
Jifeng Dai
MLLM
VLM
104
2
0
20 Dec 2024
A recent evaluation on the performance of LLMs on radiation oncology physics using questions of randomly shuffled options
Peilong Wang
J. Holmes
Ziqiang Liu
Dequan Chen
Tianming Liu
Jiajian Shen
Wei Liu
LRM
ELM
LM&MA
87
0
0
14 Dec 2024
CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search
Kaixin Wu
Yixin Ji
Ziyang Chen
Qiang Wang
Cunxiang Wang
...
Jia Xu
Zhongyi Liu
Jinjie Gu
Yuan Zhou
Linjian Mo
KELM
CLL
92
0
0
02 Dec 2024
Efficient Alignment of Large Language Models via Data Sampling
Amrit Khera
Rajat Ghosh
Debojyoti Dutta
36
1
0
15 Nov 2024
Gradient Localization Improves Lifelong Pretraining of Language Models
Jared Fernandez
Yonatan Bisk
Emma Strubell
KELM
39
1
0
07 Nov 2024
DELIFT: Data Efficient Language model Instruction Fine Tuning
Ishika Agarwal
Krishnateja Killamsetty
Lucian Popa
Marina Danilevksy
ALM
VLM
58
3
0
07 Nov 2024
Latent Paraphrasing: Perturbation on Layers Improves Knowledge Injection in Language Models
Minki Kang
Sung Ju Hwang
Gibbeum Lee
Jaewoong Cho
KELM
43
0
0
01 Nov 2024
ZIP-FIT: Embedding-Free Data Selection via Compression-Based Alignment
Elyas Obbad
Iddah Mlauzi
Brando Miranda
Rylan Schaeffer
Kamal Obbad
Suhana Bedi
Sanmi Koyejo
CVBM
53
0
0
23 Oct 2024
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
Xianyang Zhan
Agam Goyal
Yilun Chen
Eshwar Chandrasekharan
Koustuv Saha
AI4MH
168
0
0
17 Oct 2024
REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models
Ambuje Gupta
Mrinal Rawat
Andreas Stolcke
Roberto Pieraccini
RALM
21
1
0
16 Oct 2024
ELICIT: LLM Augmentation via External In-Context Capability
Futing Wang
Jianhao Yan
Yue Zhang
Tao Lin
44
0
0
12 Oct 2024
Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen
Liang Song
K. Zhou
Wayne Xin Zhao
Binghui Wang
Weipeng Chen
Ji-Rong Wen
68
0
0
10 Oct 2024
From Tokens to Words: On the Inner Lexicon of LLMs
Guy Kaplan
Matanel Oren
Yuval Reif
Roy Schwartz
50
12
0
08 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
171
0
0
07 Oct 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
35
1
0
06 Oct 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
44
3
0
30 Sep 2024
Do We Need Domain-Specific Embedding Models? An Empirical Investigation
Yixuan Tang
Yi Yang
AIFin
50
3
0
27 Sep 2024
MindGuard: Towards Accessible and Sitgma-free Mental Health First Aid via Edge LLM
Sijie Ji
Xinzhe Zheng
Jiawei Sun
Renqi Chen
Wei Gao
Mani Srivastava
AI4MH
37
4
0
16 Sep 2024
Towards understanding evolution of science through language model series
Junjie Dong
Zhuoqi Lyu
Qing Ke
AI4TS
37
0
0
15 Sep 2024
DomURLs_BERT: Pre-trained BERT-based Model for Malicious Domains and URLs Detection and Classification
Abdelkader El Mahdaouy
Salima Lamsiyah
Meryem Janati Idrissi
H. Alami
Zakaria Yartaoui
Ismail Berrada
16
3
0
13 Sep 2024
Self-Masking Networks for Unsupervised Adaptation
Alfonso Taboada Warmerdam
Mathilde Caron
Yuki M. Asano
46
1
0
11 Sep 2024
1
2
3
4
...
8
9
10
Next