ResearchTrend.AI

Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping (arXiv:2002.06305)

15 February 2020
Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah A. Smith

Papers citing "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping"

50 / 137 papers shown
TelePlanNet: An AI-Driven Framework for Efficient Telecom Network Planning
Zongyuan Deng, Yujie Cai, Qing Liu, Shiyao Mu, Bin Lyu, Zhen Yang (20 May 2025)

GRASP: Municipal Budget AI Chatbots for Enhancing Civic Engagement
Jerry Xu, Justin Wang, Joley Leung, Jasmine Gu (30 Mar 2025)

Representation Engineering for Large-Language Models: Survey and Research Challenges
Lukasz Bartoszcze, Sarthak Munshi, Bryan Sukidi, Jennifer Yen, Zejia Yang, David Williams-King, Linh Le, Kosi Asuzu, Carsten Maple (24 Feb 2025)

Distributional Scaling Laws for Emergent Capabilities
Rosie Zhao, Tian Qin, David Alvarez-Melis, Sham Kakade, Naomi Saphra (24 Feb 2025) [LRM]

Fine-Tuning Games: Bargaining and Adaptation for General-Purpose Models
Benjamin Laufer, Jon M. Kleinberg, Hoda Heidari (03 Jan 2025)

Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin, Yuncheng Yang, Pengcheng Guo, Gang Li, Hang Shao, Yuchen Shi, Zihan Xu, Yun Gu, Ke Li, Xing Sun (31 Dec 2024) [ALM]

IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks
Yaming Zhang, Chenqiang Gao, Fangcen Liu, Junjie Guo, Lan Wang, Xinggan Peng, Deyu Meng (21 Dec 2024)

Unified Parameter-Efficient Unlearning for LLMs
Chenlu Ding, Jiancan Wu, Yancheng Yuan, Jinda Lu, Kai Zhang, Alex Su, Xiang Wang, Xiangnan He (30 Nov 2024) [MU, KELM]

A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models
Qiaoyu Tang, Le Yu, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun (17 Oct 2024) [MoMe]

DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Wenlong Deng, Yize Zhao, V. Vakilian, Minghui Chen, Xiaoxiao Li, Christos Thrampoulidis (12 Oct 2024)

An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers
Ashim Gupta, Sina Mahdipour Saravani, P. Sadayappan, Vivek Srikumar (17 Jun 2024)

Uncertainty modeling for fine-tuned implicit functions
A. Susmelj, Mael Macuglia, Nataša Tagasovska, Reto Sutter, Sebastiano Caprara, Jean-Philippe Thiran, E. Konukoglu (17 Jun 2024)

LT4SG@SMM4H24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models
Dasun Athukoralage, Thushari Atapattu, M. Thilakaratne, Katrina Falkner (11 Jun 2024) [LM&MA]

llmNER: (Zero|Few)-Shot Named Entity Recognition, Exploiting the Power of Large Language Models
F. Villena, Luis Miranda, Claudio Aracena (06 Jun 2024)

Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu, Shenao Wang, Ningke Li, Kaidi Wang, Yanjie Zhao, Kai Chen, Ting Yu, Yang Liu, Haoyu Wang (08 May 2024)

FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models
Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang (09 Apr 2024)

From Large Language Models and Optimization to Decision Optimization CoPilot: A Research Manifesto
Segev Wasserkrug, Léonard Boussioux, D. Hertog, F. Mirzazadeh, Ilker Birbil, Jannis Kurtz, Donato Maragno (26 Feb 2024) [LLMAG]

PL-FSCIL: Harnessing the Power of Prompts for Few-Shot Class-Incremental Learning
Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Li Li, X. Ning (26 Jan 2024) [CLL, VLM]

Canvil: Designerly Adaptation for LLM-Powered User Experiences
K. J. Kevin Feng, Q. V. Liao, Ziang Xiao, Jennifer Wortman Vaughan, Amy X. Zhang, David W. McDonald (17 Jan 2024)

Making LLMs Worth Every Penny: Resource-Limited Text Classification in Banking
Lefteris Loukas, Ilias Stogiannidis, Odysseas Diamantopoulos, Prodromos Malakasiotis, Stavros Vassos (10 Nov 2023)

Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
Laura Cabello, Emanuele Bugliarello, Stephanie Brandl, Desmond Elliott (26 Oct 2023)

Stranger Danger! Cross-Community Interactions with Fringe Users Increase the Growth of Fringe Communities on Reddit
Giuseppe Russo, Manoel Horta Ribeiro, Robert West (18 Oct 2023)

Holy Grail 2.0: From Natural Language to Constraint Models
Dimosthenis C. Tsouros, Hélène Verhaeghe, Serdar Kadiouglu, Tias Guns (03 Aug 2023)

Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning
Xindi Wang, Yufei Wang, Can Xu, Xiubo Geng, Bowen Zhang, Chongyang Tao, Frank Rudzicz, Robert E. Mercer, Daxin Jiang (28 Jul 2023)

Parameter-efficient is not sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions
Dongshuo Yin, Xueting Han, Bin Li, Hao Feng, Jinghua Bai (16 Jun 2023) [VPVLM]

Prompt to be Consistent is Better than Self-Consistent? Few-Shot and Zero-Shot Fact Verification with Pre-trained Language Models
Fengzhu Zeng, Wei Gao (05 Jun 2023)

Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization
Shoujie Tong, Heming Xia, Damai Dai, Runxin Xu, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui (24 May 2023)

Improving Convergence and Generalization Using Parameter Symmetries
Bo Zhao, Robert Mansel Gower, Robin Walters, Rose Yu (22 May 2023) [MoMe]

Investigating the Role of Feed-Forward Networks in Transformers Using Parallel Attention and Feed-Forward Net Design
Shashank Sonkar, Richard G. Baraniuk (22 May 2023)

TaskWeb: Selecting Better Source Tasks for Multi-task NLP
Joongwon Kim, Akari Asai, Gabriel Ilharco, Hannaneh Hajishirzi (22 May 2023)

On the Limitations of Simulating Active Learning
Katerina Margatina, Nikolaos Aletras (21 May 2023)

Measuring and Mitigating Local Instability in Deep Neural Networks
Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie, Anoop Kumar, Aram Galstyan (18 May 2023)

PEFT-Ref: A Modular Reference Architecture and Typology for Parameter-Efficient Finetuning Techniques
Mohammed Sabry, Anya Belz (24 Apr 2023)

On the Variance of Neural Network Training with respect to Test Sets and Distributions
Keller Jordan (04 Apr 2023) [OOD]

Exploring Data Augmentation Methods on Social Media Corpora
Isabel Garcia Pietri, Kineret Stanley (03 Mar 2023)

Make Every Example Count: On the Stability and Utility of Self-Influence for Learning from Noisy NLP Datasets
Irina Bejan, Artem Sokolov, Katja Filippova (27 Feb 2023) [TDI]

Measuring the Instability of Fine-Tuning
Yupei Du, D. Nguyen (15 Feb 2023)

A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu, Anthony Man-Cho So, Nigel Collier (24 Jan 2023)

NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
Haoxin Li, Phillip Keung, Daniel Cheng, Jungo Kasai, Noah A. Smith (11 Jan 2023)

Examining Political Rhetoric with Epistemic Stance Detection
Ankita Gupta, Su Lin Blodgett, Justin H. Gross, Brendan O'Connor (29 Dec 2022)

Text classification in shipping industry using unsupervised models and Transformer based supervised models
Yingyi Xie, Dongping Song (21 Dec 2022)

A Natural Bias for Language Generation Models
Clara Meister, Wojciech Stokowiec, Tiago Pimentel, Lei Yu, Laura Rimell, A. Kuncoro (19 Dec 2022) [MILM]

Editing Models with Task Arithmetic
Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi (08 Dec 2022) [KELM, MoMe, MU]

Task-Specific Embeddings for Ante-Hoc Explainable Text Classification
Kishaloy Halder, Josip Krapac, Alan Akbik, Anthony Brew, Matti Lyra (30 Nov 2022)

BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Joel Niklaus, Daniele Giofré (30 Nov 2022)

MEAL: Stable and Active Learning for Few-Shot Prompting
Abdullatif Köksal, Timo Schick, Hinrich Schütze (15 Nov 2022)

An Efficient Active Learning Pipeline for Legal Text Classification
Sepideh Mamooler, R. Lebret, Stéphane Massonnet, Karl Aberer (15 Nov 2022) [AILaw]

Parameter-Efficient Tuning Makes a Good Classification Head
Zhuoyi Yang, Ming Ding, Yanhui Guo, Qingsong Lv, Jie Tang (30 Oct 2022) [VLM]

IMU2CLIP: Multimodal Contrastive Learning for IMU Motion Sensors from Egocentric Videos and Text
Seungwhan Moon, Andrea Madotto, Zhaojiang Lin, Alireza Dirafzoon, Aparajita Saraf, Amy Bearman, Babak Damavandi (26 Oct 2022) [VLM]

We need to talk about random seeds
Steven Bethard (24 Oct 2022)