
One-shot Entropy Minimization

26 May 2025
Zitian Gao
Lynx Chen
Joey Zhou
Bryan Dai
Author Contacts: ztgao02@ubiquant.com, ylchen@ubiquant.com, jzhou@ubiquant.com, cbdai@ubiquant.com
arXiv (abs) · PDF · HTML
Main: 12 pages, 7 figures, 2 tables · Bibliography: 1 page · Appendix: 1 page
Abstract

We trained 13,440 large language models and found that entropy minimization requires only a single unlabeled example and 10 optimization steps to achieve performance improvements comparable to or even greater than those obtained using thousands of examples and carefully designed rewards in rule-based reinforcement learning. This striking result may prompt a rethinking of post-training paradigms for large language models. Our code is available at this https URL.
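To make the method described in the abstract concrete, below is a minimal illustrative sketch of entropy minimization on a single unlabeled prompt: sample a continuation from the current model, then take gradient steps that directly minimize the mean token-level entropy of the model's predictive distribution over the generated tokens. This is not the authors' released code; the model name, prompt, learning rate, sampling settings, and step count are placeholder assumptions (the 10-step budget follows the abstract).

import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Question: What is 17 * 24? Answer:"  # the single unlabeled example
inputs = tokenizer(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # placeholder lr

for step in range(10):  # ~10 optimization steps, per the abstract
    # Sample a fresh continuation from the current policy (no gradients here).
    model.eval()
    with torch.no_grad():
        seq = model.generate(
            **inputs,
            max_new_tokens=64,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    model.train()

    # Re-run the full sequence to get differentiable logits; positions
    # prompt_len-1 .. L-2 predict the generated tokens prompt_len .. L-1.
    logits = model(seq).logits[:, prompt_len - 1 : -1, :]
    log_probs = F.log_softmax(logits, dim=-1)

    # Mean token-level entropy of the predictive distribution,
    # minimized directly as the training loss.
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()

    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    print(f"step {step}: entropy = {entropy.item():.4f}")

The design intuition, as suggested by the abstract, is that sharpening the model's own predictive distribution on its sampled outputs acts as an unsupervised confidence signal, requiring no labels or reward design.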

@article{gao2025_2505.20282,
  title={One-shot Entropy Minimization},
  author={Zitian Gao and Lynx Chen and Joey Zhou and Bryan Dai},
  journal={arXiv preprint arXiv:2505.20282},
  year={2025}
}