Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2211.14275
Cited By
Solving math word problems with process- and outcome-based feedback
25 November 2022
J. Uesato
Nate Kushman
Ramana Kumar
Francis Song
Noah Y. Siegel
L. Wang
Antonia Creswell
G. Irving
I. Higgins
FaML
ReLM
AIMat
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Solving math word problems with process- and outcome-based feedback"
22 / 72 papers shown
Title
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision
Zihan Wang
Yunxuan Li
Yuexin Wu
Liangchen Luo
Le Hou
Hongkun Yu
Jingbo Shang
LRM
40
18
0
05 Feb 2024
TinyGSM: achieving >80% on GSM8k with small language models
Bingbin Liu
Sébastien Bubeck
Ronen Eldan
Janardhan Kulkarni
Yuanzhi Li
Anh Nguyen
Rachel A. Ward
Yi Zhang
ALM
24
47
0
14 Dec 2023
Scalable AI Safety via Doubly-Efficient Debate
Jonah Brown-Cohen
Geoffrey Irving
Georgios Piliouras
32
15
0
23 Nov 2023
Positional Description Matters for Transformers Arithmetic
Ruoqi Shen
Sébastien Bubeck
Ronen Eldan
Yin Tat Lee
Yuanzhi Li
Yi Zhang
34
37
0
22 Nov 2023
Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models
Keming Lu
Hongyi Yuan
Runji Lin
Junyang Lin
Zheng Yuan
Chang Zhou
Jingren Zhou
MoE
LRM
40
52
0
15 Nov 2023
Improving Large Language Model Fine-tuning for Solving Math Problems
Yixin Liu
Avi Singh
C. D. Freeman
John D. Co-Reyes
Peter J. Liu
LRM
ReLM
37
45
0
16 Oct 2023
Constructive Large Language Models Alignment with Diverse Feedback
Tianshu Yu
Ting-En Lin
Yuchuan Wu
Min Yang
Fei Huang
Yongbin Li
ALM
40
9
0
10 Oct 2023
Don't throw away your value model! Generating more preferable text with Value-Guided Monte-Carlo Tree Search decoding
Jiacheng Liu
Andrew Cohen
Ramakanth Pasunuru
Yejin Choi
Hannaneh Hajishirzi
Asli Celikyilmaz
16
24
0
26 Sep 2023
GPT Can Solve Mathematical Problems Without a Calculator
Z. Yang
Ming Ding
Qingsong Lv
Zhihuan Jiang
Zehai He
Yuyi Guo
Jinfeng Bai
Jie Tang
RALM
LRM
36
52
0
06 Sep 2023
Benchmarks for Detecting Measurement Tampering
Fabien Roger
Ryan Greenblatt
Max Nadeau
Buck Shlegeris
Nate Thomas
28
2
0
29 Aug 2023
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Jiasheng Ye
Zaixiang Zheng
Yu Bao
Lihua Qian
Quanquan Gu
DiffM
54
14
0
23 Aug 2023
Learning from Mistakes via Cooperative Study Assistant for Large Language Models
Danqing Wang
Lei Li
32
6
0
23 May 2023
Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes
Justin Reppert
Ben Rachbach
Charlie George
Luke Stebbing
Ju-Seung Byun
Maggie Appleton
Andreas Stuhlmuller
ReLM
LRM
38
17
0
04 Jan 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,106
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
314
3,248
0
21 Mar 2022
Teaching language models to support answers with verified quotes
Jacob Menick
Maja Trebacz
Vladimir Mikulik
John Aslanides
Francis Song
...
Mia Glaese
Susannah Young
Lucy Campbell-Gillingham
G. Irving
Nat McAleese
ELM
RALM
246
259
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
373
8,495
0
28 Jan 2022
Explaining Answers with Entailment Trees
Bhavana Dalvi
Peter Alexander Jansen
Oyvind Tafjord
Zhengnan Xie
Hannah Smith
Leighanna Pipatanangkura
Peter Clark
ReLM
FAtt
LRM
239
184
0
17 Apr 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
250
677
0
06 Jan 2021
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
280
1,595
0
18 Sep 2019
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
204
200
0
02 May 2018
Previous
1
2