Learning Continuous Control Policies by Stochastic Value Gradients

30 October 2015

David Silver

Papers citing "Learning Continuous Control Policies by Stochastic Value Gradients"

29 / 329 papers shown

Title
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation I. Popov N. Heess Timothy Lillicrap Roland Hafner Gabriel Barth-Maron Matej Vecerík Thomas Lampe Yuval Tassa Tom Erez Martin Riedmiller OffRL 31 263 0 10 Apr 2017
Stochastic Neural Networks for Hierarchical Reinforcement Learning Carlos Florensa Yan Duan Pieter Abbeel BDL 47 360 0 10 Apr 2017
One-Shot Imitation Learning Yan Duan Marcin Andrychowicz Bradly C. Stadie Jonathan Ho Jonas Schneider Ilya Sutskever Pieter Abbeel Wojciech Zaremba OffRL 23 682 0 21 Mar 2017
Sensor Fusion for Robot Control through Deep Reinforcement Learning Steven Bohez Tim Verbelen E. D. Coninck B. Vankeirsbilck Pieter Simoens Bart Dhoedt SSL 12 29 0 13 Mar 2017
Prediction and Control with Temporal Segment Models Nikhil Mishra Pieter Abbeel Igor Mordatch BDL 29 64 0 12 Mar 2017
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning Yevgen Chebotar Karol Hausman Marvin Zhang Gaurav Sukhatme S. Schaal Sergey Levine 37 159 0 08 Mar 2017
Towards Generalization and Simplicity in Continuous Control Aravind Rajeswaran Kendall Lowrey E. Todorov Sham Kakade OffRL 55 276 0 08 Mar 2017
Understanding Synthetic Gradients and Decoupled Neural Interfaces Wojciech M. Czarnecki G. Swirszcz Max Jaderberg Simon Osindero Oriol Vinyals Koray Kavukcuoglu 33 81 0 01 Mar 2017
Trainable Greedy Decoding for Neural Machine Translation Jiatao Gu Kyunghyun Cho V. Li 24 73 0 08 Feb 2017
Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning Francois Belletti Daniel Haziza G. Gomes Alexandre M. Bayen 19 139 0 30 Jan 2017
Model-based Adversarial Imitation Learning Nir Baram Oron Anschel Shie Mannor GAN 22 42 0 07 Dec 2016
RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning Yan Duan John Schulman Xi Chen Peter L. Bartlett Ilya Sutskever Pieter Abbeel OffRL 35 1,008 0 09 Nov 2016
Reparameterization trick for discrete variables Seiya Tokui Issei Sato 37 11 0 04 Nov 2016
Sample Efficient Actor-Critic with Experience Replay Ziyun Wang V. Bapst N. Heess Volodymyr Mnih Rémi Munos Koray Kavukcuoglu Nando de Freitas 33 755 0 03 Nov 2016
Towards Lifelong Self-Supervision: A Deep Learning Direction for Robotics J. M. Wong 27 11 0 01 Nov 2016
Learning and Transfer of Modulated Locomotor Controllers N. Heess Greg Wayne Yuval Tassa Timothy Lillicrap Martin Riedmiller David Silver 35 207 0 17 Oct 2016
Sim-to-Real Robot Learning from Pixels with Progressive Nets Andrei A. Rusu Matej Vecerík Thomas Rothörl N. Heess Razvan Pascanu R. Hadsell 39 532 0 13 Oct 2016
Connecting Generative Adversarial Networks and Actor-Critic Methods David Pfau Oriol Vinyals OffRL AI4CE 30 186 0 06 Oct 2016
Playing FPS Games with Deep Reinforcement Learning Guillaume Lample Devendra Singh Chaplot OffRL EgoV 39 583 0 18 Sep 2016
Decoupled Neural Interfaces using Synthetic Gradients Max Jaderberg Wojciech M. Czarnecki Simon Osindero Oriol Vinyals Alex Graves David Silver Koray Kavukcuoglu 47 354 0 18 Aug 2016
Actor-critic versus direct policy search: a comparison based on sample complexity Arnaud de Froissard de Broissia Olivier Sigaud 28 12 0 29 Jun 2016
Review of state-of-the-arts in artificial intelligence with application to AI safety problem V. Shakirov 20 10 0 11 May 2016
Benchmarking Deep Reinforcement Learning for Continuous Control Yan Duan Xi Chen Rein Houthooft John Schulman Pieter Abbeel OffRL 20 1,687 0 22 Apr 2016
Continuous Deep Q-Learning with Model-based Acceleration S. Gu Timothy Lillicrap Ilya Sutskever Sergey Levine 42 1,008 0 02 Mar 2016
PLATO: Policy Learning using Adaptive Trajectory Optimization G. Kahn Tianhao Zhang Sergey Levine Pieter Abbeel 32 136 0 02 Mar 2016
Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms Tom Zahavy Bingyi Kang Alex Sivak Jiashi Feng Huan Xu Shie Mannor OOD AAML 39 12 0 07 Feb 2016
Memory-based control with recurrent neural networks N. Heess Jonathan J. Hunt Timothy Lillicrap David Silver 35 301 0 14 Dec 2015
Continuous control with deep reinforcement learning Timothy Lillicrap Jonathan J. Hunt Alexander Pritzel N. Heess Tom Erez Yuval Tassa David Silver Daan Wierstra 52 13,120 0 09 Sep 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation John Schulman Philipp Moritz Sergey Levine Michael I. Jordan Pieter Abbeel OffRL 13 3,322 0 08 Jun 2015