1705.08439
Thinking Fast and Slow with Deep Learning and Tree Search
23 May 2017
Thomas W. Anthony
Zheng Tian
David Barber
Papers citing
"Thinking Fast and Slow with Deep Learning and Tree Search"
50 / 62 papers shown
Title
Deep Symbolic Optimization: Reinforcement Learning for Symbolic Mathematics
Conor F. Hayes
Felipe Leno Da Silva
Jiachen Yang
T. Nathan Mundhenk
Chak Shing Lee
...
Ahmet Can Solak
Thomas Desautels
Daniel Faissol
Brenden K. Petersen
Mikel Landajuela
14
0
0
16 May 2025
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
Jiarui Yao
Yifan Hao
Hanning Zhang
Hanze Dong
Wei Xiong
Nan Jiang
Tong Zhang
LRM
62
0
0
05 May 2025
Aligning Constraint Generation with Design Intent in Parametric CAD
Evan Casey
Tianyu Zhang
Shu Ishida
John Roger Thompson
Amir Hosein Khasahmadi
Joseph George Lambourne
P. Jayaraman
K. Willis
38
0
0
17 Apr 2025
Learning Autonomous Code Integration for Math Language Models
Haozhe Wang
Long Li
C. Qu
Fengming Zhu
Weidi Xu
Wei Chu
Fangzhen Lin
56
1
0
02 Feb 2025
GraphXForm: Graph transformer for computer-aided molecular design
Jonathan Pirnay
Jan G. Rittig
Alexander B. Wolf
Martin Grohe
Jakob Burger
Alexander Mitsos
D. G. Grimm
AI4CE
58
1
0
03 Nov 2024
BlendRL: A Framework for Merging Symbolic and Neural Policy Learning
Hikaru Shindo
Quentin Delfosse
Devendra Singh Dhami
Kristian Kersting
43
3
0
15 Oct 2024
SAPIENT: Mastering Multi-turn Conversational Recommendation with Strategic Planning and Monte Carlo Tree Search
Hanwen Du
B. Peng
Xia Ning
38
0
0
12 Oct 2024
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
Zirui Zhao
Hanze Dong
Amrita Saha
Caiming Xiong
Doyen Sahoo
LRM
35
3
0
10 Oct 2024
Lean-STaR: Learning to Interleave Thinking and Proving
Haohan Lin
Zhiqing Sun
Yiming Yang
Sean Welleck
ReLM
LRM
72
25
0
14 Jul 2024
GOAL: A Generalist Combinatorial Optimization Agent Learner
Darko Drakulic
Sofia Michel
J. Andreoli
39
6
0
21 Jun 2024
Stress-Testing Capability Elicitation With Password-Locked Models
Ryan Greenblatt
Fabien Roger
Dmitrii Krasheninnikov
David M. Krueger
38
14
0
29 May 2024
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation
Chengxing Jia
Pengyuan Wang
Ziniu Li
Yi-Chen Li
Zhilong Zhang
Nan Tang
Yang Yu
OffRL
39
1
0
27 May 2024
Vertical Symbolic Regression
Nan Jiang
Md Nasim
Yexiang Xue
27
1
0
19 Dec 2023
Curriculum Learning for Cooperation in Multi-Agent Reinforcement Learning
R. Bhati
S. Gottipati
Clodéric Mars
Matthew E. Taylor
37
0
0
19 Dec 2023
Boot and Switch: Alternating Distillation for Zero-Shot Dense Retrieval
Fan Jiang
Qiongkai Xu
Tom Drummond
Trevor Cohn
21
2
0
27 Nov 2023
Optimize Planning Heuristics to Rank, not to Estimate Cost-to-Goal
Leah A. Chrestien
Tomás Pevný
Stefan Edelkamp
Antonín Komenda
36
9
0
30 Oct 2023
From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning
Zheyuan Zhang
Shane Storks
Fengyuan Hu
Sungryull Sohn
Moontae Lee
Honglak Lee
Joyce Chai
LRM
39
3
0
24 Oct 2023
Tree Search in DAG Space with Model-based Reinforcement Learning for Causal Discovery
Victor-Alexandru Darvariu
Stephen Hailes
Mirco Musolesi
CML
46
2
0
20 Oct 2023
Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent Models in Pommerman
Jannis Weil
Johannes Czech
Tobias Meuser
Kristian Kersting
11
2
0
22 May 2023
Beyond Games: A Systematic Review of Neural Monte Carlo Tree Search Applications
Marco Kemmerling
Daniel Lutticke
Robert H. Schmitt
32
14
0
14 Mar 2023
Deep Generative Symbolic Regression with Monte-Carlo-Tree-Search
Pierre-Alexandre Kamienny
Guillaume Lample
Sylvain Lamprier
M. Virgolin
34
25
0
22 Feb 2023
Abstracting Imperfect Information Away from Two-Player Zero-Sum Games
Samuel Sokota
Ryan D'Orazio
Chun Kai Ling
David J. Wu
J. Zico Kolter
Noam Brown
27
4
0
22 Jan 2023
Peano: Learning Formal Mathematical Reasoning
Gabriel Poesia
Noah D. Goodman
LRM
25
20
0
29 Nov 2022
Learning to design without prior data: Discovering generalizable design strategies using deep learning and tree search
Ayush Raina
Jonathan Cagan
Christopher McComb
AI4CE
25
9
0
28 Nov 2022
Fine-tuning language models to find agreement among humans with diverse preferences
Michiel A. Bakker
Martin Chadwick
Hannah R. Sheahan
Michael Henry Tessler
Lucy Campbell-Gillingham
...
Nat McAleese
Amelia Glaese
John Aslanides
M. Botvinick
Christopher Summerfield
ALM
46
215
0
28 Nov 2022
LEMMA: Bootstrapping High-Level Mathematical Reasoning with Learned Symbolic Abstractions
Zhening Li
Gabriel Poesia
Omar Costilla-Reyes
Noah D. Goodman
Armando Solar-Lezama
21
7
0
16 Nov 2022
Dynamic Collaborative Multi-Agent Reinforcement Learning Communication for Autonomous Drone Reforestation
P. D. Siedler
AI4CE
25
4
0
14 Nov 2022
Hindsight Learning for MDPs with Exogenous Inputs
Sean R. Sinclair
Felipe Vieira Frujeri
Ching-An Cheng
Luke Marshall
Hugo Barbalho
Jingling Li
Jennifer Neville
Ishai Menache
Adith Swaminathan
18
22
0
13 Jul 2022
A Survey on Model-based Reinforcement Learning
Fan Luo
Tian Xu
Hang Lai
Xiong-Hui Chen
Weinan Zhang
Yang Yu
OffRL
LRM
50
101
0
19 Jun 2022
TALM: Tool Augmented Language Models
Aaron T Parisi
Yao-Min Zhao
Noah Fiedel
KELM
RALM
LLMAG
32
144
0
24 May 2022
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation
Maximilian Igl
Daewoo Kim
Alex Kuefler
Paul Mougin
Punit Shah
K. Shiarlis
Drago Anguelov
Mark Palatucci
Brandyn White
Shimon Whiteson
35
64
0
06 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
333
11,953
0
04 Mar 2022
Generative Cooperative Networks for Natural Language Generation
Sylvain Lamprier
Thomas Scialom
Antoine Chaffin
Vincent Claveau
Ewa Kijak
Jacopo Staiano
Benjamin Piwowarski
GAN
54
13
0
28 Jan 2022
AlphaD3M: Machine Learning Pipeline Synthesis
Iddo Drori
Yamuna Krishnamurthy
Rémi Rampin
Raoni Lourenço
Jorge Piazentin Ono
Kyunghyun Cho
Claudio Silva
J. Freire
22
85
0
03 Nov 2021
On The Ingredients of an Effective Zero-shot Semantic Parser
Pengcheng Yin
John Wieting
Avirup Sil
Graham Neubig
50
15
0
15 Oct 2021
Design Strategy Network: A deep hierarchical framework to represent generative design strategies in complex action spaces
Ayush Raina
Jonathan Cagan
Christopher McComb
AI4CE
25
13
0
07 Oct 2021
Goal-Directed Design Agents: Integrating Visual Imitation with One-Step Lookahead Optimization for Generative Design
Ayush Raina
Lucas Puentes
Jonathan Cagan
Christopher McComb
AI4CE
26
6
0
07 Oct 2021
Thinking Fast and Slow in AI: the Role of Metacognition
M. B. Ganapini
Murray Campbell
F. Fabiano
L. Horesh
J. Lenchner
Andrea Loreggia
N. Mattei
F. Rossi
Biplav Srivastava
K. Venable
LLMAG
AI4CE
41
17
0
05 Oct 2021
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nisan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
35
294
0
22 Sep 2021
Planning Spatial Networks with Monte Carlo Tree Search
Victor-Alexandru Darvariu
Stephen Hailes
Mirco Musolesi
27
7
0
12 Jun 2021
Monte Carlo Tree Search: A Review of Recent Modifications and Applications
M. Świechowski
Konrad Godlewski
B. Sawicki
Jacek Mańdziuk
41
249
0
08 Mar 2021
Transfer of Fully Convolutional Policy-Value Networks Between Games and Game Variants
Dennis J. N. J. Soemers
Vegard Mella
Éric Piette
Matthew Stephenson
C. Browne
O. Teytaud
OffRL
23
8
0
24 Feb 2021
Learning to Play Two-Player Perfect-Information Games without Knowledge
Quentin Cohen-Solal
OffRL
39
13
0
03 Aug 2020
Combining Deep Reinforcement Learning and Search for Imperfect-Information Games
Noam Brown
A. Bakhtin
Adam Lerer
Qucheng Gong
17
133
0
27 Jul 2020
Guiding Deep Molecular Optimization with Genetic Exploration
Sungsoo Ahn
Junsup Kim
Hankook Lee
Jinwoo Shin
29
70
0
04 Jul 2020
Learning to Play No-Press Diplomacy with Best Response Policy Iteration
Thomas W. Anthony
Tom Eccles
Andrea Tacchetti
János Kramár
I. Gemp
...
Richard Everett
Roman Werpachowski
Satinder Singh
T. Graepel
Yoram Bachrach
13
42
0
08 Jun 2020
Plan2Vec: Unsupervised Representation Learning by Latent Plans
Ge Yang
Amy Zhang
Ari S. Morcos
Joelle Pineau
Pieter Abbeel
Roberto Calandra
SSL
OffRL
28
27
0
07 May 2020
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI
Christopher Berner
Greg Brockman
Brooke Chan
...
Szymon Sidor
Ilya Sutskever
Jie Tang
Filip Wolski
Susan Zhang
GNN
VLM
CLL
AI4CE
LRM
41
1,795
0
13 Dec 2019
Combining Q-Learning and Search with Amortized Value Estimates
Jessica B. Hamrick
V. Bapst
Alvaro Sanchez-Gonzalez
Tobias Pfaff
T. Weber
Lars Buesing
Peter W. Battaglia
OffRL
27
47
0
05 Dec 2019
DeepLine: AutoML Tool for Pipelines Generation using Deep Reinforcement Learning and Hierarchical Actions Filtering
Yuval Heffetz
Roman Vainshtein
Gilad Katz
Lior Rokach
17
39
0
31 Oct 2019