On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines

8 June 2020
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow

Papers citing "On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines"

50 / 93 papers shown
Can Frontier LLMs Replace Annotators in Biomedical Text Mining? Analyzing Challenges and Exploring Solutions
Yichong Zhao
Susumu Goto
60
0
0
05 Mar 2025
Decoding Reading Goals from Eye Movements
Omer Shubi
Cfir Avraham Hadar
Yevgeni Berzak
AIMat
49
1
0
28 Oct 2024
HATFormer: Historic Handwritten Arabic Text Recognition with Transformers
Adrian Chan
Anupam Mijar
Mehreen Saeed
Chau-Wai Wong
Akram Khater
41
0
0
03 Oct 2024
Efficient LLM Context Distillation
Rajesh Upadhayayaya
Zachary Smith
Chritopher Kottmyer
Manish Raj Osti
42
1
0
03 Sep 2024
Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models
Christopher Schröder
Gerhard Heyer
VLM
44
0
0
13 Jun 2024
Investigating the Robustness of Modelling Decisions for Few-Shot Cross-Topic Stance Detection: A Preregistered Study
Myrthe Reuver
Suzan Verberne
Antske Fokkens
37
1
0
05 Apr 2024
CLCE: An Approach to Refining Cross-Entropy and Contrastive Learning for Optimized Learning Fusion
Zijun Long
George Killick
Lipeng Zhuang
Gerardo Aragon Camarasa
Zaiqiao Meng
R. McCreadie
VLM
47
2
0
22 Feb 2024
PolyIE: A Dataset of Information Extraction from Polymer Material Scientific Literature
Jerry Junyang Cheung
Yuchen Zhuang
Yinghao Li
Pranav Shetty
Wantian Zhao
Sanjeev Grampurohit
R. Ramprasad
Chao Zhang
AI4CE
14
11
0
13 Nov 2023
FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics
Yupei Du
Albert Gatt
Dong Nguyen
31
1
0
10 Oct 2023
Prompt to be Consistent is Better than Self-Consistent? Few-Shot and Zero-Shot Fact Verification with Pre-trained Language Models
Fengzhu Zeng
Wei Gao
17
5
0
05 Jun 2023
Understanding Emotion Valence is a Joint Deep Learning Task
Gabriel Roccabruna
Seyed Mahed Mousavi
Giuseppe Riccardi
21
0
0
27 May 2023
Toward Connecting Speech Acts and Search Actions in Conversational Search Tasks
Souvick Ghosh
Satanu Ghosh
C. Shah
25
2
0
08 May 2023
KINLP at SemEval-2023 Task 12: Kinyarwanda Tweet Sentiment Analysis
Antoine Nzeyimana
17
3
0
25 Apr 2023
On the Variance of Neural Network Training with respect to Test Sets and Distributions
Keller Jordan
OOD
21
10
0
04 Apr 2023
Sociocultural knowledge is needed for selection of shots in hate speech detection tasks
Antonis Maronikolakis
Abdullatif Köksal
Hinrich Schütze
43
0
0
04 Apr 2023
Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Text Classifiers
Kamil Bujel
Andrew Caines
H. Yannakoudakis
Marek Rei
AI4TS
19
1
0
14 Mar 2023
Multimodal Prompting with Missing Modalities for Visual Recognition
Yi-Lun Lee
Yi-Hsuan Tsai
Wei-Chen Chiu
Chen-Yu Lee
VPVLM
27
94
0
06 Mar 2023
Measuring the Instability of Fine-Tuning
Yupei Du
D. Nguyen
25
4
0
15 Feb 2023
Evaluating the Robustness of Discrete Prompts
Yoichi Ishibashi
Danushka Bollegala
Katsuhito Sudoh
Satoshi Nakamura
23
18
0
11 Feb 2023
Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval
Asha Vishwanathan
R. Warrier
G. V. Suresh
Chandrashekhar Kandpal
11
2
0
25 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
23
3
0
24 Jan 2023
NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
Haoxin Li
Phillip Keung
Daniel Cheng
Jungo Kasai
Noah A. Smith
14
3
0
11 Jan 2023
InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers
Leonid Boytsov
Preksha Patel
Vivek Sourabh
Riddhi Nisar
Sayan Kundu
R. Ramanathan
Eric Nyberg
23
19
0
08 Jan 2023
Examining Political Rhetoric with Epistemic Stance Detection
Ankita Gupta
Su Lin Blodgett
Justin H. Gross
Brendan O'Connor
22
0
0
29 Dec 2022
KL Regularized Normalization Framework for Low Resource Tasks
Neeraj Kumar
Ankur Narang
Brejesh Lall
26
1
0
21 Dec 2022
Task-Specific Embeddings for Ante-Hoc Explainable Text Classification
Kishaloy Halder
Josip Krapac
A. Akbik
Anthony Brew
Matti Lyra
30
0
0
30 Nov 2022
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Joel Niklaus
Daniele Giofré
27
11
0
30 Nov 2022
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Xiang Dai
Sarvnaz Karimi
24
3
0
24 Nov 2022
MEAL: Stable and Active Learning for Few-Shot Prompting
Abdullatif Köksal
Timo Schick
Hinrich Schütze
19
25
0
15 Nov 2022
An Efficient Active Learning Pipeline for Legal Text Classification
Sepideh Mamooler
R. Lebret
Stéphane Massonnet
Karl Aberer
AILaw
24
4
0
15 Nov 2022
Probing neural language models for understanding of words of estimative probability
Damien Sileo
Marie-Francine Moens
19
10
0
07 Nov 2022
Gradient Knowledge Distillation for Pre-trained Language Models
Lean Wang
Lei Li
Xu Sun
VLM
23
5
0
02 Nov 2022
We need to talk about random seeds
Steven Bethard
31
8
0
24 Oct 2022
Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping
Chenghao Yang
Xuezhe Ma
35
6
0
19 Oct 2022
Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
Shuo Xie
Jiahao Qiu
Ankita Pasad
Li Du
Qing Qu
Hongyuan Mei
35
16
0
18 Oct 2022
AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning
Tao Yang
Jinghao Deng
Xiaojun Quan
Qifan Wang
Shaoliang Nie
30
3
0
12 Oct 2022
Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling
Haw-Shiuan Chang
Ruei-Yao Sun
Kathryn Ricci
Andrew McCallum
43
14
0
10 Oct 2022
UU-Tax at SemEval-2022 Task 3: Improving the generalizability of language models for taxonomy classification through data augmentation
I. Sarhan
P. Mosteiro
Marco Spruit
29
2
0
07 Oct 2022
An Empirical Study on Cross-X Transfer for Legal Judgment Prediction
Joel Niklaus
Matthias Sturmer
Ilias Chalkidis
ELM
AILaw
37
19
0
25 Sep 2022
Drawing Causal Inferences About Performance Effects in NLP
Sandra Wankmüller
CML
16
1
0
14 Sep 2022
Heuristic-free Optimization of Force-Controlled Robot Search Strategies in Stochastic Environments
Bastian Alt
Darko Katic
Rainer Jäkel
Michael Beetz
21
6
0
15 Jul 2022
Zero-shot Cross-lingual Transfer is Under-specified Optimization
Shijie Wu
Benjamin Van Durme
Mark Dredze
25
6
0
12 Jul 2022
Pretrained Models for Multilingual Federated Learning
Orion Weller
Marc Marone
Vladimir Braverman
Dawn J Lawrie
Benjamin Van Durme
VLM
FedML
AI4CE
33
42
0
06 Jun 2022
Can Foundation Models Help Us Achieve Perfect Secrecy?
Simran Arora
Christopher Ré
FedML
21
6
0
27 May 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
239
45
0
24 May 2022
ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
Akari Asai
Mohammadreza Salehi
Matthew E. Peters
Hannaneh Hajishirzi
127
100
0
24 May 2022
Calibration of Natural Language Understanding Models with Venn-ABERS Predictors
Patrizio Giovannotti
38
6
0
21 May 2022
Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction
Manikandan Ravikiran
Bharathi Raja Chakravarthi
22
3
0
12 May 2022
Few-shot Mining of Naturally Occurring Inputs and Outputs
Mandar Joshi
Terra Blevins
M. Lewis
Daniel S. Weld
Luke Zettlemoyer
27
1
0
09 May 2022
A Comparison of Approaches for Imbalanced Classification Problems in the Context of Retrieving Relevant Documents for an Analysis
Sandra Wankmüller
28
2
0
03 May 2022