Deep Learning for Reward Design to Improve Monte Carlo Tree Search in
ATARI Games

24 April 2016

Richard L. Lewis

Papers citing "Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games"

Title | Authors | Tags | Counts | Date
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning | Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee | - | 45 3 0 | 08 May 2023
The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning | Haotian Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang | OffRL | 11 5 0 | 27 Feb 2023
Redeeming Intrinsic Rewards via Constrained Optimization | Eric Chen, Zhang-Wei Hong, Joni Pajarinen, Pulkit Agrawal | OnRL | 36 24 0 | 14 Nov 2022
Turning Mathematics Problems into Games: Reinforcement Learning and Gröbner bases together solve Integer Feasibility Problems | Yue Wu, J. D. Loera | - | 21 4 0 | 25 Aug 2022
Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks | Sungryull Sohn, Sungtae Lee, Jongwook Choi, H. V. Seijen, Mehdi Fatemi, Honglak Lee | - | 173 3 0 | 13 Jul 2021
Circuit Routing Using Monte Carlo Tree Search and Deep Neural Networks | Youbiao He, F. S. Bao | - | 13 13 0 | 24 Jun 2020
Reinforcement Learning with Goal-Distance Gradient | Kai Jiang, X. Qin | - | 14 0 0 | 01 Jan 2020
How Should an Agent Practice? | Janarthanan Rajendran, Richard L. Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh | - | 26 9 0 | 15 Dec 2019
Enforcing constraints for time series prediction in supervised, unsupervised and reinforcement learning | P. Stinis | AI4TS, AI4CE | 30 11 0 | 17 May 2019
Generative Adversarial Self-Imitation Learning | Yijie Guo, Junhyuk Oh, Satinder Singh, Honglak Lee | GAN | 15 58 0 | 03 Dec 2018
Evolved Policy Gradients | Rein Houthooft, Richard Y. Chen, Phillip Isola, Bradly C. Stadie, Filip Wolski, Jonathan Ho, Pieter Abbeel | - | 49 227 0 | 13 Feb 2018
Trial without Error: Towards Safe Reinforcement Learning via Human Intervention | William Saunders, Girish Sastry, Andreas Stuhlmuller, Owain Evans | OffRL | 24 229 0 | 17 Jul 2017
Automatic Goal Generation for Reinforcement Learning Agents | Carlos Florensa, David Held, Xinyang Geng, Pieter Abbeel | - | 78 499 0 | 17 May 2017
Probabilistically Safe Policy Transfer | David Held, Zoe McCarthy, Michael Zhang, Fred Shentu, Pieter Abbeel | - | 26 19 0 | 15 May 2017
Value Iteration Networks | Aviv Tamar, Yi Wu, G. Thomas, Sergey Levine, Pieter Abbeel | - | 27 649 0 | 09 Feb 2016