Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.03314
Cited By
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
6 August 2024
Charlie Snell
Jaehoon Lee
Kelvin Xu
Aviral Kumar
LRM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters"
50 / 124 papers shown
Title
Can Frontier LLMs Replace Annotators in Biomedical Text Mining? Analyzing Challenges and Exploring Solutions
Yichong Zhao
Susumu Goto
91
0
0
05 Mar 2025
Multi2: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing
Juntai Cao
Xiang Zhang
Raymond Li
Chuyuan Li
Shafiq Joty
Shafiq Joty
Giuseppe Carenini
152
1
0
27 Feb 2025
Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
Yuan Sui
Yufei He
Tri Cao
Simeng Han
Yulin Chen
Bryan Hooi
LRM
AI4CE
175
7
0
27 Feb 2025
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
Yancheng He
Shilong Li
Jing Liu
Weixun Wang
Xingyuan Bu
...
Zhongyuan Peng
Zhenru Zhang
Zhicheng Zheng
Wenbo Su
Bo Zheng
ELM
LRM
135
16
0
26 Feb 2025
Stay Focused: Problem Drift in Multi-Agent Debate
Jonas Becker
Lars Benedikt Kaesberg
Andreas Stephan
Jan Philip Wahle
Terry Ruas
Bela Gipp
114
2
0
26 Feb 2025
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
Zhaowei Zhang
Fengshuo Bai
Qizhi Chen
Chengdong Ma
Mingzhi Wang
Haoran Sun
Zilong Zheng
Yaodong Yang
131
5
0
26 Feb 2025
Spontaneous Giving and Calculated Greed in Language Models
Yuxuan Li
Hirokazu Shirado
ReLM
LRM
AI4CE
77
2
0
24 Feb 2025
Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective
Chengyin Xu
Kaiyuan Chen
Xiao Li
Ke Shen
Chenggang Li
OffRL
168
2
0
24 Feb 2025
DISC: DISC: Dynamic Decomposition Improves LLM Inference Scaling
Jonathan Light
Wei Cheng
Benjamin Rivière
Wu Yue
Masafumi Oyamada
Mengdi Wang
Yisong Yue
Santiago Paternain
Haifeng Chen
ReLM
LRM
110
2
0
23 Feb 2025
IPO: Your Language Model is Secretly a Preference Classifier
Shivank Garg
Ayush Singh
Shweta Singh
Paras Chopra
433
1
0
22 Feb 2025
Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study
Yujie Lin
Ante Wang
Moye Chen
Jingyao Liu
Hao Liu
Jinsong Su
Xinyan Xiao
LRM
114
3
0
17 Feb 2025
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving
Xin Xu
Yan Xu
Tianhao Chen
Yuchen Yan
Chengwu Liu
...
Yansen Wang
Yichun Yin
Yijiao Wang
Lifeng Shang
Qiang Liu
LRM
128
3
0
17 Feb 2025
How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines
Ayan Sengupta
Ayan Sengupta
Tanmoy Chakraborty
124
0
0
17 Feb 2025
Preference Optimization for Reasoning with Pseudo Feedback
Fangkai Jiao
Geyang Guo
Xingxing Zhang
Nancy F. Chen
Shafiq Joty
Furu Wei
LRM
195
15
0
17 Feb 2025
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning
Xinyu Zhang
Yuxuan Dong
Yongpeng Wu
Jiaxing Huang
Chengyou Jia
Basura Fernando
Mike Zheng Shou
Lingling Zhang
Jun Liu
AIMat
ReLM
LRM
85
13
0
17 Feb 2025
Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
Fan Zhou
Zengzhi Wang
Qian Liu
Junlong Li
Pengfei Liu
ALM
185
14
0
17 Feb 2025
Learning to Reason from Feedback at Test-Time
Yanyang Li
Michael R. Lyu
Liwei Wang
LRM
106
4
0
16 Feb 2025
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models
Daniel Fleischer
Moshe Berchansky
Gad Markovits
Moshe Wasserblat
ReLM
ELM
LRM
143
0
0
13 Feb 2025
Typhoon T1: An Open Thai Reasoning Model
Pittawat Taveekitworachai
Potsawee Manakul
Kasima Tharnpipitchai
Kunat Pipatanakul
OffRL
LRM
200
0
0
13 Feb 2025
Bag of Tricks for Inference-time Computation of LLM Reasoning
Fan Liu
Wenshuo Chao
Naiqiang Tan
Hao Liu
OffRL
LRM
142
5
0
11 Feb 2025
When More is Less: Understanding Chain-of-Thought Length in LLMs
Yuyang Wu
Yifei Wang
Tianqi Du
Stefanie Jegelka
Yisen Wang
Yisen Wang
LRM
142
48
0
11 Feb 2025
Examining False Positives under Inference Scaling for Mathematical Reasoning
Yu Guang Wang
Nan Yang
Liang Wang
Furu Wei
LRM
123
4
0
10 Feb 2025
InSTA: Towards Internet-Scale Training For Agents
Brandon Trabucco
Gunnar Sigurdsson
Robinson Piramuthu
Ruslan Salakhutdinov
ALM
176
4
0
10 Feb 2025
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search
Zongyu Lin
Yao Tang
Xingcheng Yao
Da Yin
Ziniu Hu
Ningyu Zhang
Kai-Wei Chang
LRM
127
6
0
04 Feb 2025
Policy Guided Tree Search for Enhanced LLM Reasoning
Yang Li
LRM
171
0
0
04 Feb 2025
Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
Isha Puri
Shivchander Sudalairaj
Guangxuan Xu
Kai Xu
Akash Srivastava
LRM
175
5
0
03 Feb 2025
Learning to Generate Unit Tests for Automated Debugging
Archiki Prasad
Elias Stengel-Eskin
Justin Chih-Yao Chen
Zaid Khan
Joey Tianyi Zhou
ELM
135
4
0
03 Feb 2025
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization
Zishun Yu
Tengyu Xu
Di Jin
Karthik Abinav Sankararaman
Yun He
...
Eryk Helenowski
Chen Zhu
Sinong Wang
Hao Ma
Han Fang
LRM
198
10
0
29 Jan 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Tianzhe Chu
Yuexiang Zhai
Jihan Yang
Shengbang Tong
Saining Xie
Dale Schuurmans
Quoc V. Le
Sergey Levine
Yi-An Ma
OffRL
228
123
0
28 Jan 2025
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
Haotian Luo
Li Shen
Haiying He
Yun Wang
Shiwei Liu
Wei Li
Naiqiang Tan
Xiaochun Cao
Dacheng Tao
VLM
LRM
158
99
0
22 Jan 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Yuhang Zang
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Ziyu Liu
...
Haodong Duan
Wentao Zhang
Kai Chen
Dahua Lin
Jiaqi Wang
VLM
198
25
0
21 Jan 2025
T1: Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Zhenyu Hou
Xin Lv
Rui Lu
Jing Zhang
Yongqian Li
Zijun Yao
Juanzi Li
J. Tang
Yuxiao Dong
OffRL
LRM
ReLM
138
33
0
20 Jan 2025
Revisiting Rogers' Paradox in the Context of Human-AI Interaction
Katherine M. Collins
Umang Bhatt
Ilia Sucholutsky
129
1
0
16 Jan 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Xinzhe Ni
Zicheng Lin
...
Yiyao Yu
C. Shi
Ruihang Chu
Jin Zeng
Yujiu Yang
LRM
153
25
0
08 Jan 2025
The Race to Efficiency: A New Perspective on AI Scaling Laws
Chien-Ping Lu
98
1
0
04 Jan 2025
Thinking Before Running! Efficient Code Generation with Thorough Exploration and Optimal Refinement
Xiaoqing Zhang
Yuhan Liu
Flood Sung
Xiuying Chen
Shuo Shang
Rui Yan
59
1
0
30 Dec 2024
UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models
Boyang Xue
Fei Mi
Qi Zhu
Hongru Wang
Rui Wang
Sheng Wang
Erxin Yu
Xuming Hu
Kam-Fai Wong
HILM
193
2
0
16 Dec 2024
Predicting Emergent Capabilities by Finetuning
Charlie Snell
Eric Wallace
Dan Klein
Sergey Levine
ELM
LRM
130
5
0
25 Nov 2024
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Davide Paglieri
Bartłomiej Cupiał
Samuel Coward
Ulyana Piterbarg
Maciej Wolczyk
...
Lerrel Pinto
Rob Fergus
Jakob Foerster
Jack Parker-Holder
Tim Rocktaschel
LLMAG
LRM
198
22
0
20 Nov 2024
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
Justin Deschenaux
Çağlar Gülçehre
116
5
0
28 Oct 2024
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
Antonis Antoniades
Albert Örwall
Kexun Zhang
Yuxi Xie
Anirudh Goyal
William Yang Wang
LLMAG
134
21
0
26 Oct 2024
Understanding Layer Significance in LLM Alignment
Guangyuan Shi
Zexin Lu
Xiaoyu Dong
Wenlong Zhang
Xuanyu Zhang
Yujie Feng
Xiao-Ming Wu
122
3
0
23 Oct 2024
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong
Shivam Agarwal
Yizhe Zhang
Jiacheng Ye
Lin Zheng
...
Peilin Zhao
W. Bi
Jiawei Han
Hao Peng
Dianbo Sui
AI4CE
117
27
0
23 Oct 2024
LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs
Yujun Zhou
Jingdong Yang
Yue Huang
Kehan Guo
Zoe Emory
...
Tian Gao
Werner Geyer
Nuno Moniz
Nitesh Chawla
Xiangliang Zhang
113
7
0
18 Oct 2024
Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning
Lang Cao
Chao Peng
Renhong Chen
Wu Ning
Yingtian Zou
Yitong Li
LRM
91
0
0
18 Oct 2024
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
S. Gorti
Ilan Gofman
Zhaoyan Liu
Jiapeng Wu
Noël Vouitsis
Guangwei Yu
Jesse C. Cresswell
Rasa Hosseinzadeh
SyDa
105
12
0
16 Oct 2024
Latent Action Pretraining from Videos
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
124
44
0
15 Oct 2024
Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs
Ishan Jindal
Chandana Badrinath
Pranjal Bharti
Lakkidi Vinay
Sachin Dev Sharma
CLL
ALM
66
1
0
14 Oct 2024
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
Zirui Zhao
Hanze Dong
Amrita Saha
Caiming Xiong
Doyen Sahoo
LRM
90
7
0
10 Oct 2024
CursorCore: Assist Programming through Aligning Anything
Hao Jiang
Qi Liu
Rui Li
Shengyu Ye
Shijin Wang
99
1
0
09 Oct 2024
Previous
1
2
3
Next