2407.21787
Cited By
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
3 January 2025
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
Papers citing "Large Language Monkeys: Scaling Inference Compute with Repeated Sampling"
50 / 162 papers shown
Title
CodeMonkeys: Scaling Test-Time Compute for Software Engineering
Ryan Ehrlich
Bradley Brown
Jordan Juravsky
Ronald Clark
Christopher Ré
Azalia Mirhoseini
57
6
0
24 Jan 2025
From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning
Yafu Li
Zhilin Wang
Tingchen Fu
Ganqu Cui
Sen Yang
Yu Cheng
45
1
0
21 Jan 2025
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Zhenyu Hou
Xin Lv
Rui Lu
J. Zhang
Yongqian Li
Zijun Yao
Juanzi Li
J. Tang
Yuxiao Dong
OffRL
LRM
ReLM
61
20
0
20 Jan 2025
Multi-Step Reasoning in Korean and the Emergent Mirage
Guijin Son
Hyunwoo Ko
Dasol Choi
LRM
ReLM
65
0
0
10 Jan 2025
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Xinyu Guan
L. Zhang
Yifei Liu
Ning Shang
Youran Sun
Yi Zhu
Fan Yang
Mao Yang
LRM
SyDa
ReLM
62
78
0
08 Jan 2025
Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying
Federico Castagna
I. Sassoon
Simon Parsons
LRM
85
0
0
19 Dec 2024
Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling
Junyi Li
Hwee Tou Ng
LRM
90
1
0
19 Dec 2024
Test-Time Alignment via Hypothesis Reweighting
Yoonho Lee
Jonathan Williams
Henrik Marklund
Archit Sharma
E. Mitchell
Anikait Singh
Chelsea Finn
91
3
0
11 Dec 2024
Smoothie: Label Free Language Model Routing
Neel Guha
Mayee F. Chen
Trevor Chow
Ishan S. Khare
Christopher Ré
71
4
0
06 Dec 2024
Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models
Yanxi Chen
Xuchen Pan
Yaliang Li
Bolin Ding
Jingren Zhou
LRM
92
9
0
29 Nov 2024
Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers
Benedikt Stroebl
Sayash Kapoor
Arvind Narayanan
LRM
85
13
0
26 Nov 2024
VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
Lei Li
Y. X. Wei
Zhihui Xie
Xuqing Yang
Yifan Song
...
Tianyu Liu
Sujian Li
Bill Yuchen Lin
Lingpeng Kong
Qiang Liu
CoGe
VLM
120
24
0
26 Nov 2024
Drowning in Documents: Consequences of Scaling Reranker Inference
Mathew Jacob
Erik Lindgren
Matei A. Zaharia
Michael Carbin
Omar Khattab
Andrew Drozdov
OffRL
74
4
0
18 Nov 2024
AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning
Kun Xiang
Zhili Liu
Zihao Jiang
Yunshuang Nie
Runhui Huang
...
Yihan Zeng
J. Han
Lanqing Hong
Hang Xu
Xiaodan Liang
LRM
106
10
0
18 Nov 2024
Scaling Laws for Precision
Tanishq Kumar
Zachary Ankner
Benjamin Spector
Blake Bordelon
Niklas Muennighoff
Mansheej Paul
C. Pehlevan
Christopher Ré
Aditi Raghunathan
AIFin
MoMe
46
13
0
07 Nov 2024
Scaling LLM Inference with Optimized Sample Compute Allocation
Kexun Zhang
Shang Zhou
Danqing Wang
William Yang Wang
Lei Li
50
9
0
29 Oct 2024
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
Justin Deschenaux
Çağlar Gülçehre
44
2
0
28 Oct 2024
Library Learning Doesn't: The Curious Case of the Single-Use "Library"
Ian Berlot-Attwell
Frank Rudzicz
Xujie Si
37
1
0
26 Oct 2024
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
Antonis Antoniades
Albert Örwall
Kexun Zhang
Yuxi Xie
Anirudh Goyal
William Yang Wang
LLMAG
59
11
0
26 Oct 2024
C^2: Scalable Auto-Feedback for LLM-based Chart Generation
Woosung Koh
Jang Han Yoon
M. Lee
Youngjin Song
Jaegwan Cho
Jaehyun Kang
Taehyeon Kim
Se-Young Yun
Youngjae Yu
B. Lee
42
0
0
24 Oct 2024
Little Giants: Synthesizing High-Quality Embedding Data at Scale
Haonan Chen
Liang Wang
Nan Yang
Bo Li
Ziliang Zhao
Furu Wei
Zhicheng Dou
SyDa
36
1
0
24 Oct 2024
A Simple Model of Inference Scaling Laws
Noam Levi
LRM
32
6
0
21 Oct 2024
Keep Guessing? When Considering Inference Scaling, Mind the Baselines
G. Yona
Or Honovich
Omer Levy
Roee Aharoni
UQLM
LRM
33
0
0
20 Oct 2024
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
Jiahao Qiu
Yifu Lu
Yifan Zeng
Jiacheng Guo
Jiayi Geng
Huazheng Wang
Kaixuan Huang
Yue Wu
Mengdi Wang
36
22
0
18 Oct 2024
GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings
Raghuveer Thirukovalluru
Bhuwan Dhingra
34
2
0
18 Oct 2024
How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs
Guhao Feng
Kai-Bo Yang
Yuntian Gu
Xinyue Ai
Shengjie Luo
Jiacheng Sun
Di He
ZeLin Li
Liwei Wang
LRM
37
6
0
17 Oct 2024
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
S. Gorti
Ilan Gofman
Zhaoyan Liu
Jiapeng Wu
Noël Vouitsis
Guangwei Yu
Jesse C. Cresswell
Rasa Hosseinzadeh
SyDa
55
6
0
16 Oct 2024
JudgeBench: A Benchmark for Evaluating LLM-based Judges
Sijun Tan
Siyuan Zhuang
Kyle Montgomery
William Y. Tang
Alejandro Cuadron
Chenguang Wang
Raluca A. Popa
Ion Stoica
ELM
ALM
51
38
0
16 Oct 2024
Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System
Weize Chen
Jiarui Yuan
Chen Qian
Cheng Yang
Zhiyuan Liu
Maosong Sun
LLMAG
28
4
0
10 Oct 2024
Efficient Reinforcement Learning with Large Language Model Priors
Xue Yan
Yan Song
Xidong Feng
Mengyue Yang
Haifeng Zhang
Haitham Bou Ammar
Jun Wang
OffRL
33
3
0
10 Oct 2024
ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time
Yi Ding
Bolian Li
Ruqi Zhang
MLLM
72
6
0
09 Oct 2024
Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
Mehul Damani
Idan Shenfeld
Andi Peng
Andreea Bobu
Jacob Andreas
39
16
0
07 Oct 2024
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification
Zhenwen Liang
Ye Liu
Tong Niu
Xiangliang Zhang
Yingbo Zhou
Semih Yavuz
LRM
32
17
0
05 Oct 2024
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
Murong Yue
Wenlin Yao
Haitao Mi
Dian Yu
Ziyu Yao
Dong Yu
LRM
48
4
0
04 Oct 2024
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
John Yang
Carlos E. Jimenez
Alex Zhang
K. Lieret
Joyce Yang
...
Gabriel Synnaeve
Karthik Narasimhan
Diyi Yang
Sida I. Wang
Ofir Press
41
23
0
04 Oct 2024
ToolGen: Unified Tool Retrieval and Calling via Generation
Renxi Wang
Xudong Han
Lei Ji
Shu Wang
Timothy Baldwin
Haonan Li
LLMAG
72
6
0
04 Oct 2024
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning
Di Zhang
Jianbo Wu
Jingdi Lei
Tong Che
Jiatong Li
...
Shufei Zhang
Marco Pavone
Yuqiang Li
Wanli Ouyang
Dongzhan Zhou
LRM
33
43
0
03 Oct 2024
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg
Lerrel Pinto
Rob Fergus
SyDa
37
2
0
03 Oct 2024
Recursive Abstractive Processing for Retrieval in Dynamic Datasets
Charbel Chucri
Rami Azouz
Joachim Ott
50
0
0
02 Oct 2024
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
Shubham Toshniwal
Wei Du
Ivan Moshkov
Branislav Kisacanin
Alexan Ayrapetyan
Igor Gitman
LRM
18
49
0
02 Oct 2024
Integrative Decoding: Improve Factuality via Implicit Self-consistency
Yi Cheng
Xiao Liang
Yeyun Gong
Wen Xiao
Song Wang
...
Wenjie Li
Jian Jiao
Qi Chen
Peng Cheng
Wayne Xiong
HILM
56
1
0
02 Oct 2024
Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling
Jinghan Li
Zhicheng Sun
Fei Li
102
1
0
02 Oct 2024
TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking
Danqing Wang
Jianxin Ma
Fei Fang
Lei Li
LLMAG
LRM
154
0
0
02 Oct 2024
Revisiting the Superficial Alignment Hypothesis
Mohit Raghavendra
Vaskar Nath
Sean Hendryx
LRM
23
0
0
27 Sep 2024
Archon: An Architecture Search Framework for Inference-Time Techniques
Jon Saad-Falcon
Adrian Gamarra Lafuente
Shlok Natarajan
Nahum Maru
Hristo Todorov
...
E. Kelly Buchanan
Mayee Chen
Neel Guha
Christopher Ré
Azalia Mirhoseini
AI4CE
33
18
0
23 Sep 2024
Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models
Sijing Chen
Yuan Feng
Laipeng He
Tianwei He
Wendi He
...
Huimin Zhang
Xiang Zhang
Guangcheng Zhao
Hongbin Zhou
Pengpeng Zou
34
4
0
18 Sep 2024
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
Zachary S. Siegel
Sayash Kapoor
Nitya Nadgir
Benedikt Stroebl
Arvind Narayanan
31
8
0
17 Sep 2024
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Hritik Bansal
Arian Hosseini
Rishabh Agarwal
Vinh Q. Tran
Mehran Kazemi
SyDa
OffRL
LRM
39
37
0
29 Aug 2024
Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning
Xinglin Wang
Shaoxiong Feng
Yiwei Li
Peiwen Yuan
Y. Zhang
Boyuan Pan
Heda Wang
Yao Hu
Kan Li
LRM
40
17
0
24 Aug 2024
Preference-Guided Reflective Sampling for Aligning Language Models
Hai Ye
Hwee Tou Ng
31
3
0
22 Aug 2024