ResearchTrend.AI
Assessing the Macro and Micro Effects of Random Seeds on Fine-Tuning Large Language Models
Hao Zhou, Guergana Savova, Lijing Wang
10 March 2025 · arXiv:2503.07329

Papers citing "Assessing the Macro and Micro Effects of Random Seeds on Fine-Tuning Large Language Models" (10 of 10 shown):

  1. Steven Bethard. "We need to talk about random seeds." 24 Oct 2022.
  2. David Picard. "Torch.manual_seed(3407) is all you need: On the influence of random seeds in deep learning architectures for computer vision." 16 Sep 2021.
  3. S. Shahinfar, P. Meek, G. Falzon. "How many images do I need? Understanding how sample size per class affects deep learning model performance metrics for balanced designs in autonomous wildlife monitoring." 16 Oct 2020.
  4. Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah A. Smith. "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping." 15 Feb 2020.
  5. Pranava Madhyastha, Dhruv Batra. "On Model Stability as a Function of Random Seed." 23 Sep 2019.
  6. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov. "RoBERTa: A Robustly Optimized BERT Pretraining Approach." 26 Jul 2019.
  7. Alex Jinpeng Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. "SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems." 02 May 2019.
  8. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." 11 Oct 2018.
  9. Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. "GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding." 20 Apr 2018.
  10. Yoshua Bengio. "Practical recommendations for gradient-based training of deep architectures." 24 Jun 2012.
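The papers listed above all examine how sensitive training and evaluation outcomes are to the choice of random seed. A minimal standard-library sketch of the underlying issue (function name and parameters are hypothetical, chosen only for illustration): a stochastic evaluation is exactly reproducible under a fixed seed, but its result shifts when the seed changes.

```python
import random

def noisy_eval(seed: int, n: int = 10) -> float:
    """Simulate a seed-dependent evaluation: average n noisy scores
    drawn from a Gaussian, using a private RNG seeded explicitly."""
    rng = random.Random(seed)  # isolated RNG so global state is untouched
    return round(sum(rng.gauss(0.8, 0.05) for _ in range(n)) / n, 4)

# Re-running with the same seed reproduces the score exactly;
# a different seed yields a (slightly) different score.
same = noisy_eval(3407) == noisy_eval(3407)
diff = noisy_eval(3407) != noisy_eval(1)
```

In a real fine-tuning run the analogous step is seeding every RNG source in play (e.g. Python, NumPy, and the deep-learning framework) before initialization and data shuffling; the cited work studies how much variance remains across such seeds.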