1604.07095
Cited By
Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games
24 April 2016
Xiaoxiao Guo
Satinder Singh
Richard L. Lewis
Honglak Lee
Papers citing "Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games"
15 / 15 papers shown
Title
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
Yulai Zhao
Zhuoran Yang
Zhaoran Wang
Jason D. Lee
45
3
0
08 May 2023
The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning
Haotian Hu
Yiqin Yang
Qianchuan Zhao
Chongjie Zhang
OffRL
11
5
0
27 Feb 2023
Redeeming Intrinsic Rewards via Constrained Optimization
Eric Chen
Zhang-Wei Hong
Joni Pajarinen
Pulkit Agrawal
OnRL
36
24
0
14 Nov 2022
Turning Mathematics Problems into Games: Reinforcement Learning and Gröbner bases together solve Integer Feasibility Problems
Yue Wu
J. D. Loera
21
4
0
25 Aug 2022
Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks
Sungryull Sohn
Sungtae Lee
Jongwook Choi
H. V. Seijen
Mehdi Fatemi
Honglak Lee
173
3
0
13 Jul 2021
Circuit Routing Using Monte Carlo Tree Search and Deep Neural Networks
Youbiao He
F. S. Bao
13
13
0
24 Jun 2020
Reinforcement Learning with Goal-Distance Gradient
Kai Jiang
X. Qin
14
0
0
01 Jan 2020
How Should an Agent Practice?
Janarthanan Rajendran
Richard L. Lewis
Vivek Veeriah
Honglak Lee
Satinder Singh
26
9
0
15 Dec 2019
Enforcing constraints for time series prediction in supervised, unsupervised and reinforcement learning
P. Stinis
AI4TS
AI4CE
30
11
0
17 May 2019
Generative Adversarial Self-Imitation Learning
Yijie Guo
Junhyuk Oh
Satinder Singh
Honglak Lee
GAN
15
58
0
03 Dec 2018
Evolved Policy Gradients
Rein Houthooft
Richard Y. Chen
Phillip Isola
Bradly C. Stadie
Filip Wolski
Jonathan Ho
Pieter Abbeel
49
227
0
13 Feb 2018
Trial without Error: Towards Safe Reinforcement Learning via Human Intervention
William Saunders
Girish Sastry
Andreas Stuhlmuller
Owain Evans
OffRL
24
229
0
17 Jul 2017
Automatic Goal Generation for Reinforcement Learning Agents
Carlos Florensa
David Held
Xinyang Geng
Pieter Abbeel
78
499
0
17 May 2017
Probabilistically Safe Policy Transfer
David Held
Zoe McCarthy
Michael Zhang
Fred Shentu
Pieter Abbeel
26
19
0
15 May 2017
Value Iteration Networks
Aviv Tamar
Yi Wu
G. Thomas
Sergey Levine
Pieter Abbeel
27
649
0
09 Feb 2016