Thinking Fast and Slow with Deep Learning and Tree Search

23 May 2017

Papers citing "Thinking Fast and Slow with Deep Learning and Tree Search"

Deep Symbolic Optimization: Reinforcement Learning for Symbolic Mathematics Conor F. Hayes Felipe Leno Da Silva Jiachen Yang T. Nathan Mundhenk Chak Shing Lee ... Ahmet Can Solak Thomas Desautels Daniel Faissol Brenden K. Petersen Mikel Landajuela 14 0 0 16 May 2025
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL Jiarui Yao Yifan Hao Hanning Zhang Hanze Dong Wei Xiong Nan Jiang Tong Zhang LRM 62 0 0 05 May 2025
Aligning Constraint Generation with Design Intent in Parametric CAD Evan Casey Tianyu Zhang Shu Ishida John Roger Thompson Amir Hosein Khasahmadi Joseph George Lambourne P. Jayaraman K. Willis 38 0 0 17 Apr 2025
Learning Autonomous Code Integration for Math Language Models Haozhe Wang Long Li C. Qu Fengming Zhu Weidi Xu Wei Chu Fangzhen Lin 56 1 0 02 Feb 2025
GraphXForm: Graph transformer for computer-aided molecular design Jonathan Pirnay Jan G. Rittig Alexander B. Wolf Martin Grohe Jakob Burger Alexander Mitsos D. G. Grimm AI4CE 58 1 0 03 Nov 2024
BlendRL: A Framework for Merging Symbolic and Neural Policy Learning Hikaru Shindo Quentin Delfosse Devendra Singh Dhami Kristian Kersting 43 3 0 15 Oct 2024
SAPIENT: Mastering Multi-turn Conversational Recommendation with Strategic Planning and Monte Carlo Tree Search Hanwen Du B. Peng Xia Ning 38 0 0 12 Oct 2024
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning Zirui Zhao Hanze Dong Amrita Saha Caiming Xiong Doyen Sahoo LRM 35 3 0 10 Oct 2024
Lean-STaR: Learning to Interleave Thinking and Proving Haohan Lin Zhiqing Sun Yiming Yang Sean Welleck ReLM LRM 72 25 0 14 Jul 2024
GOAL: A Generalist Combinatorial Optimization Agent Learner Darko Drakulic Sofia Michel J. Andreoli 39 6 0 21 Jun 2024
Stress-Testing Capability Elicitation With Password-Locked Models Ryan Greenblatt Fabien Roger Dmitrii Krasheninnikov David M. Krueger 38 14 0 29 May 2024
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation Chengxing Jia Pengyuan Wang Ziniu Li Yi-Chen Li Zhilong Zhang Nan Tang Yang Yu OffRL 39 1 0 27 May 2024
Vertical Symbolic Regression Nan Jiang Md Nasim Yexiang Xue 27 1 0 19 Dec 2023
Curriculum Learning for Cooperation in Multi-Agent Reinforcement Learning R. Bhati S. Gottipati Clodéric Mars Matthew E. Taylor 37 0 0 19 Dec 2023
Boot and Switch: Alternating Distillation for Zero-Shot Dense Retrieval Fan Jiang Qiongkai Xu Tom Drummond Trevor Cohn 21 2 0 27 Nov 2023
Optimize Planning Heuristics to Rank, not to Estimate Cost-to-Goal Leah A. Chrestien Tomás Pevný Stefan Edelkamp Antonín Komenda 36 9 0 30 Oct 2023
From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning Zheyuan Zhang Shane Storks Fengyuan Hu Sungryull Sohn Moontae Lee Honglak Lee Joyce Chai LRM 39 3 0 24 Oct 2023
Tree Search in DAG Space with Model-based Reinforcement Learning for Causal Discovery Victor-Alexandru Darvariu Stephen Hailes Mirco Musolesi CML 46 2 0 20 Oct 2023
Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent Models in Pommerman Jannis Weil Johannes Czech Tobias Meuser Kristian Kersting 11 2 0 22 May 2023
Beyond Games: A Systematic Review of Neural Monte Carlo Tree Search Applications Marco Kemmerling Daniel Lütticke Robert H. Schmitt 32 14 0 14 Mar 2023
Deep Generative Symbolic Regression with Monte-Carlo-Tree-Search Pierre-Alexandre Kamienny Guillaume Lample Sylvain Lamprier M. Virgolin 34 25 0 22 Feb 2023
Abstracting Imperfect Information Away from Two-Player Zero-Sum Games Samuel Sokota Ryan D'Orazio Chun Kai Ling David J. Wu J. Zico Kolter Noam Brown 27 4 0 22 Jan 2023
Peano: Learning Formal Mathematical Reasoning Gabriel Poesia Noah D. Goodman LRM 25 20 0 29 Nov 2022
Learning to design without prior data: Discovering generalizable design strategies using deep learning and tree search Ayush Raina Jonathan Cagan Christopher McComb AI4CE 25 9 0 28 Nov 2022
Fine-tuning language models to find agreement among humans with diverse preferences Michiel A. Bakker Martin Chadwick Hannah R. Sheahan Michael Henry Tessler Lucy Campbell-Gillingham ... Nat McAleese Amelia Glaese John Aslanides M. Botvinick Christopher Summerfield ALM 46 215 0 28 Nov 2022
LEMMA: Bootstrapping High-Level Mathematical Reasoning with Learned Symbolic Abstractions Zhening Li Gabriel Poesia Omar Costilla-Reyes Noah D. Goodman Armando Solar-Lezama 21 7 0 16 Nov 2022
Dynamic Collaborative Multi-Agent Reinforcement Learning Communication for Autonomous Drone Reforestation P. D. Siedler AI4CE 25 4 0 14 Nov 2022
Hindsight Learning for MDPs with Exogenous Inputs Sean R. Sinclair Felipe Vieira Frujeri Ching-An Cheng Luke Marshall Hugo Barbalho Jingling Li Jennifer Neville Ishai Menache Adith Swaminathan 18 22 0 13 Jul 2022
A Survey on Model-based Reinforcement Learning Fan Luo Tian Xu Hang Lai Xiong-Hui Chen Weinan Zhang Yang Yu OffRL LRM 50 101 0 19 Jun 2022
TALM: Tool Augmented Language Models Aaron T Parisi Yao-Min Zhao Noah Fiedel KELM RALM LLMAG 32 144 0 24 May 2022
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation Maximilian Igl Daewoo Kim Alex Kuefler Paul Mougin Punit Shah K. Shiarlis Drago Anguelov Mark Palatucci Brandyn White Shimon Whiteson 35 64 0 06 May 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 333 11,953 0 04 Mar 2022
Generative Cooperative Networks for Natural Language Generation Sylvain Lamprier Thomas Scialom Antoine Chaffin Vincent Claveau Ewa Kijak Jacopo Staiano Benjamin Piwowarski GAN 54 13 0 28 Jan 2022
AlphaD3M: Machine Learning Pipeline Synthesis Iddo Drori Yamuna Krishnamurthy Rémi Rampin Raoni Lourenço Jorge Piazentin Ono Kyunghyun Cho Claudio Silva J. Freire 22 85 0 03 Nov 2021
On The Ingredients of an Effective Zero-shot Semantic Parser Pengcheng Yin John Wieting Avirup Sil Graham Neubig 50 15 0 15 Oct 2021
Design Strategy Network: A deep hierarchical framework to represent generative design strategies in complex action spaces Ayush Raina Jonathan Cagan Christopher McComb AI4CE 25 13 0 07 Oct 2021
Goal-Directed Design Agents: Integrating Visual Imitation with One-Step Lookahead Optimization for Generative Design Ayush Raina Lucas Puentes Jonathan Cagan Christopher McComb AI4CE 26 6 0 07 Oct 2021
Thinking Fast and Slow in AI: the Role of Metacognition M. B. Ganapini Murray Campbell F. Fabiano L. Horesh J. Lenchner Andrea Loreggia N. Mattei F. Rossi Biplav Srivastava K. Venable LLMAG AI4CE 41 17 0 05 Oct 2021
Recursively Summarizing Books with Human Feedback Jeff Wu Long Ouyang Daniel M. Ziegler Nissan Stiennon Ryan J. Lowe Jan Leike Paul Christiano ALM 35 294 0 22 Sep 2021
Planning Spatial Networks with Monte Carlo Tree Search Victor-Alexandru Darvariu Stephen Hailes Mirco Musolesi 27 7 0 12 Jun 2021
Monte Carlo Tree Search: A Review of Recent Modifications and Applications M. Świechowski Konrad Godlewski B. Sawicki Jacek Mańdziuk 41 249 0 08 Mar 2021
Transfer of Fully Convolutional Policy-Value Networks Between Games and Game Variants Dennis J. N. J. Soemers Vegard Mella Éric Piette Matthew Stephenson C. Browne O. Teytaud OffRL 23 8 0 24 Feb 2021
Learning to Play Two-Player Perfect-Information Games without Knowledge Quentin Cohen-Solal OffRL 39 13 0 03 Aug 2020
Combining Deep Reinforcement Learning and Search for Imperfect-Information Games Noam Brown A. Bakhtin Adam Lerer Qucheng Gong 17 133 0 27 Jul 2020
Guiding Deep Molecular Optimization with Genetic Exploration Sungsoo Ahn Junsup Kim Hankook Lee Jinwoo Shin 29 70 0 04 Jul 2020
Learning to Play No-Press Diplomacy with Best Response Policy Iteration Thomas W. Anthony Tom Eccles Andrea Tacchetti János Kramár I. Gemp ... Richard Everett Roman Werpachowski Satinder Singh T. Graepel Yoram Bachrach 13 42 0 08 Jun 2020
Plan2Vec: Unsupervised Representation Learning by Latent Plans Ge Yang Amy Zhang Ari S. Morcos Joelle Pineau Pieter Abbeel Roberto Calandra SSL OffRL 28 27 0 07 May 2020
Dota 2 with Large Scale Deep Reinforcement Learning OpenAI: Christopher Berner Greg Brockman Brooke Chan ... Szymon Sidor Ilya Sutskever Jie Tang Filip Wolski Susan Zhang GNN VLM CLL AI4CE LRM 41 1,795 0 13 Dec 2019
Combining Q-Learning and Search with Amortized Value Estimates Jessica B. Hamrick V. Bapst Alvaro Sanchez-Gonzalez Tobias Pfaff T. Weber Lars Buesing Peter W. Battaglia OffRL 27 47 0 05 Dec 2019
DeepLine: AutoML Tool for Pipelines Generation using Deep Reinforcement Learning and Hierarchical Actions Filtering Yuval Heffetz Roman Vainshtein Gilad Katz Lior Rokach 17 39 0 31 Oct 2019