ResearchTrend.AI
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling

29 August 2024
Hritik Bansal
Arian Hosseini
Rishabh Agarwal
Vinh Q. Tran
Mehran Kazemi
    SyDa
    OffRL
    LRM

Papers citing "Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling"

30 / 30 papers shown
Knowledge Augmented Complex Problem Solving with Large Language Models: A Survey
Da Zheng
Lun Du
Junwei Su
Yuchen Tian
Yuqi Zhu
Jintian Zhang
Lanning Wei
Ningyu Zhang
H. Chen
LRM
61
0
0
06 May 2025
Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
J. Zhang
Flood Sung
Zhengyuan Yang
Yang Gao
Chongjie Zhang
LLMAG
44
0
0
28 Apr 2025
SWE-Synth: Synthesizing Verifiable Bug-Fix Data to Enable Large Language Models in Resolving Real-World Bugs
Minh V.T. Pham
Huy N. Phan
Hoang N. Phan
Cuong Le Chi
T. Nguyen
Nghi D. Q. Bui
SyDa
29
0
0
20 Apr 2025
An Empirically Grounded Identifiability Theory Will Accelerate Self-Supervised Learning Research
Patrik Reizinger
Randall Balestriero
David Klindt
Wieland Brendel
40
0
0
17 Apr 2025
Sleep-time Compute: Beyond Inference Scaling at Test-time
Kevin Lin
Charlie Snell
Yansen Wang
Charles Packer
Sarah Wooders
Ion Stoica
Joseph E. Gonzalez
47
2
0
17 Apr 2025
Training Small Reasoning LLMs with Cognitive Preference Alignment
Wenrui Cai
Chengyu Wang
Junbing Yan
Jun Huang
Xiangzhong Fang
LRM
26
1
0
14 Apr 2025
Achieving Unanimous Consensus in Decision Making Using Multi-Agents
Apurba Pokharel
Ram Dantu
Shakila Zaman
Sirisha Talapuru
Vinh Quach
49
1
0
02 Apr 2025
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning
Nishad Singhi
Hritik Bansal
Arian Hosseini
Aditya Grover
Kai-Wei Chang
Marcus Rohrbach
Anna Rohrbach
OffRL
LRM
42
2
0
01 Apr 2025
Weak-to-Strong Generalization Even in Random Feature Networks, Provably
Marko Medvedev
Kaifeng Lyu
Dingli Yu
Sanjeev Arora
Zhiyuan Li
Nathan Srebro
107
0
0
04 Mar 2025
Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling
Yiwen Ding
Zhiheng Xi
Wei He
Zhuoyuan Li
Yitao Zhai
Xiaowei Shi
Xunliang Cai
Tao Gui
Qi Zhang
Xuanjing Huang
LRM
77
3
0
24 Feb 2025
Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective
Chengyin Xu
Kaiyuan Chen
Xiao Li
Ke Shen
Chenggang Li
OffRL
56
0
0
24 Feb 2025
Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations
Kunal Handa
Alex Tamkin
Miles McCain
Saffron Huang
Esin Durmus
...
Kevin K. Troy
Dario Amodei
Jared Kaplan
Jack Clark
Deep Ganguli
MLAU
63
12
0
11 Feb 2025
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Jinyang Wu
Mingkuan Feng
Shuai Zhang
Ruihan Jin
Feihu Che
Zengqi Wen
J. Tao
LRM
68
8
0
04 Feb 2025
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee
Ziyang Cai
Avi Schwarzschild
Kangwook Lee
Dimitris Papailiopoulos
ReLM
VLM
LRM
AI4CE
83
4
0
03 Feb 2025
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
Runming Yang
Taiqiang Wu
Jiahao Wang
Pengfei Hu
Ngai Wong
Yujiu Yang
178
1
0
11 Nov 2024
Guiding Through Complexity: What Makes Good Supervision for Hard Math Reasoning Tasks?
Xuan He
Da Yin
Nanyun Peng
LRM
40
0
0
27 Oct 2024
Computational Bottlenecks of Training Small-scale Large Language Models
Saleh Ashkboos
Iman Mirzadeh
Keivan Alizadeh
Mohammad Hossein Sekhavat
Moin Nabi
Mehrdad Farajtabar
Fartash Faghri
26
0
0
25 Oct 2024
Little Giants: Synthesizing High-Quality Embedding Data at Scale
Haonan Chen
Liang Wang
Nan Yang
Yichen Zhu
Ziliang Zhao
Furu Wei
Zhicheng Dou
SyDa
39
1
0
24 Oct 2024
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration
Qintong Li
Jiahui Gao
Sheng Wang
Renjie Pi
Xueliang Zhao
Chuan Wu
Xin Jiang
Zhiyu Li
Lingpeng Kong
SyDa
28
3
0
22 Oct 2024
A Simple Model of Inference Scaling Laws
Noam Levi
LRM
32
6
0
21 Oct 2024
EnsemW2S: Can an Ensemble of LLMs be Leveraged to Obtain a Stronger LLM?
Aakriti Agrawal
Mucong Ding
Zora Che
Chenghao Deng
Anirudh Satheesh
John Langford
Furong Huang
53
4
0
06 Oct 2024
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification
Zhenwen Liang
Ye Liu
Tong Niu
Xiangliang Zhang
Yingbo Zhou
Semih Yavuz
LRM
34
18
0
05 Oct 2024
Quantifying Generalization Complexity for Large Language Models
Zhenting Qi
Hongyin Luo
Xuliang Huang
Zhuokai Zhao
Yibo Jiang
Xiangjun Fan
Himabindu Lakkaraju
James Glass
LRM
ELM
34
5
0
02 Oct 2024
Not All LLM Reasoners Are Created Equal
Arian Hosseini
Alessandro Sordoni
Daniel Toyama
Rameswar Panda
Rishabh Agarwal
LRM
49
11
0
02 Oct 2024
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
Amirhossein Kazemnejad
Milad Aghajohari
Eva Portelance
Alessandro Sordoni
Siva Reddy
Rameswar Panda
Nicolas Le Roux
OffRL
LRM
36
27
0
02 Oct 2024
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
Shubham Toshniwal
Wei Du
Ivan Moshkov
Branislav Kisacanin
Alexan Ayrapetyan
Igor Gitman
LRM
26
51
0
02 Oct 2024
Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs
Yung-Chieh Chan
George Pu
Apaar Shanker
Parth Suresh
Penn Jenks
John Heyer
Sam Denton
SyDa
37
8
0
29 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
63
23
0
10 Sep 2024
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,489
0
23 Jan 2020