NeurIPS 2023 LLM Efficiency Fine-tuning Competition

13 March 2025
Mark Saroufim
Yotam Perlitz
Leshem Choshen
Luca Antiga
Greg Bowyer
Christian Puhrsch
Driss Guessous
Supriya Rao
Geeta Chauhan
Ashvini Kumar Jindal
Pawan Kumar Rajpoot
Ankur Parikh
Joe Isaacson
Weiwei Yang
Abstract

Our analysis of the NeurIPS 2023 large language model (LLM) fine-tuning competition revealed two trends: top-performing models exhibit significant overfitting on benchmark datasets, mirroring the broader issue of benchmark overfitting on popular leaderboards, and careful data curation is essential for obtaining a high-performing LLM. The competition consisted of two stages, an open evaluation stage with publicly available tasks and a closed evaluation stage with unseen tasks, which allowed us to assess the generalizability of fine-tuned LLMs. Our results highlight the limitations of current benchmark-based evaluation schemes for generative models and demonstrate the need for more robust evaluation methods. Notably, the winning submissions used standard open-source libraries and focused primarily on data curation. To facilitate further research and promote reproducibility, we release all competition entries, Docker files, and evaluation infrastructure, providing a valuable resource for the community to explore fine-tuning, overfitting, and reproducibility in LLMs.
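The abstract notes that winning entries relied on standard open-source tooling and careful data curation rather than custom training stacks. As a rough, hypothetical sketch of what such a pipeline can look like (not the competitors' actual code), the snippet below fine-tunes a causal LLM with LoRA using Hugging Face datasets, peft, and trl; the base model, dataset, prompt format, and hyperparameters are placeholder assumptions.

# Minimal sketch, NOT the winners' code: illustrates the kind of
# "standard open-source libraries" workflow the abstract refers to.
# Base model, dataset, prompt format, and hyperparameters are placeholders.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Placeholder instruction-tuning dataset; competition entries focused
# heavily on curating data like this.
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_text(example):
    # Flatten each example into a single prompt/response string for SFT.
    return {
        "text": f"### Instruction:\n{example['instruction']}\n\n"
                f"### Response:\n{example['response']}"
    }

dataset = dataset.map(to_text)

# Parameter-efficient fine-tuning (LoRA) keeps the run within a single-GPU budget.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",  # assumed base model, loaded by name
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="sft-output",
        max_steps=1000,
        per_device_train_batch_size=2,
    ),
)
trainer.train()

In the competition setting, entries like this were packaged with Docker files and scored on both the open and unseen task sets, which the released evaluation infrastructure is intended to reproduce.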

View on arXiv
@article{saroufim2025_2503.13507,
  title={NeurIPS 2023 LLM Efficiency Fine-tuning Competition},
  author={Mark Saroufim and Yotam Perlitz and Leshem Choshen and Luca Antiga and Greg Bowyer and Christian Puhrsch and Driss Guessous and Supriya Rao and Geeta Chauhan and Ashvini Kumar Jindal and Pawan Kumar Rajpoot and Ankur Parikh and Joe Isaacson and Weiwei Yang},
  journal={arXiv preprint arXiv:2503.13507},
  year={2025}
}