Trust Region Policy Optimization

19 February 2015

Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 3,098 papers shown

Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning Anusha Nagabandi I. Clavera Simin Liu R. Fearing Pieter Abbeel Sergey Levine Chelsea Finn 26 537 0 30 Mar 2018
Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system Kendall Lowrey S. Kolev Jeremy Dao Aravind Rajeswaran E. Todorov 26 58 0 28 Mar 2018
Neuronal Circuit Policies Mathias Lechner Ramin M. Hasani Radu Grosu 19 7 0 22 Mar 2018
Natural Gradient Deep Q-learning Ethan Knight Osher Lerner 24 10 0 20 Mar 2018
Optimizing Sponsored Search Ranking Strategy by Deep Reinforcement Learning Li He Liang Wang Kaipeng Liu Bo Wu Weinan Zhang 29 7 0 20 Mar 2018
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines Cathy Wu Aravind Rajeswaran Yan Duan Vikash Kumar Alexandre M. Bayen Sham Kakade Igor Mordatch Pieter Abbeel OffRL 22 151 0 20 Mar 2018
Setting up a Reinforcement Learning Task with a Real-World Robot A. R. Mahmood D. Korenkevych Brent Komer James Bergstra 26 75 0 19 Mar 2018
Simple random search provides a competitive approach to reinforcement learning Horia Mania Aurelia Guy Benjamin Recht 22 315 0 19 Mar 2018
Learning to Explore with Meta-Policy Gradient Tianbing Xu Qiang Liu Liang Zhao Jian Peng 20 54 0 13 Mar 2018
Comparing Task Simplifications to Learn Closed-Loop Object Picking Using Deep Reinforcement Learning Michel Breyer Fadri Furrer Tonci Novkovic Roland Siegwart Juan I. Nieto SSL OffRL 29 47 0 13 Mar 2018
Policy Search in Continuous Action Domains: an Overview Olivier Sigaud F. Stulp 16 72 0 13 Mar 2018
Transfer Learning with Neural AutoML Catherine Wong N. Houlsby Yifeng Lu Andrea Gesmundo 19 114 0 07 Mar 2018
Smoothed Action Value Functions for Learning Gaussian Policies Ofir Nachum Mohammad Norouzi George Tucker Dale Schuurmans 18 28 0 06 Mar 2018
Recurrent Predictive State Policy Networks Ahmed S. Hefny Zita Marinho Wen Sun S. Srinivasa Geoffrey J. Gordon 37 19 0 05 Mar 2018
The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches Md. Zahangir Alom T. Taha C. Yakopcic Stefan Westberg P. Sidike Mst Shamima Nasrin B. Van Essen A. Awwal V. Asari VLM 31 875 0 03 Mar 2018
OIL: Observational Imitation Learning Ge Li Matthias Muller Vincent Casser Neil G. Smith D. L. Michels Guohao Li 22 41 0 03 Mar 2018
Some Considerations on Learning to Explore via Meta-Reinforcement Learning Bradly C. Stadie Ge Yang Rein Houthooft Xi Chen Yan Duan Yuhuai Wu Pieter Abbeel Ilya Sutskever LRM 40 116 0 03 Mar 2018
Multi-Agent Imitation Learning for Driving Simulation Raunak P. Bhattacharyya Derek J. Phillips Blake Wulfe Jeremy Morton Alex Kuefler Mykel J. Kochenderfer 22 118 0 02 Mar 2018
Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application Yujing Hu Qing Da Anxiang Zeng Yang Yu Yinghui Xu 22 179 0 02 Mar 2018
Model-Ensemble Trust-Region Policy Optimization Thanard Kurutach I. Clavera Yan Duan Aviv Tamar Pieter Abbeel 20 449 0 28 Feb 2018
Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods Deirdre Quillen Eric Jang Ofir Nachum Chelsea Finn Julian Ibarz Sergey Levine OOD OffRL 35 202 0 28 Feb 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning George Tucker Surya Bhupatiraju S. Gu Richard Turner Zoubin Ghahramani Sergey Levine OffRL 30 126 0 27 Feb 2018
Reinforcement and Imitation Learning for Diverse Visuomotor Skills Yuke Zhu Ziyun Wang J. Merel Andrei A. Rusu Tom Erez ... S. Tunyasuvunakool János Kramár R. Hadsell Nando de Freitas N. Heess SSL 34 316 0 26 Feb 2018
Addressing Function Approximation Error in Actor-Critic Methods Scott Fujimoto H. V. Hoof David Meger OffRL 49 5,069 0 26 Feb 2018
Structured Control Nets for Deep Reinforcement Learning Mario Srouji Jian Zhang Ruslan Salakhutdinov 33 43 0 22 Feb 2018
Learning to Gather without Communication El-Mahdi El-Mhamdi R. Guerraoui Alexandre Maurer Vladislav Tempez FedML 11 1 0 21 Feb 2018
Variational Inference for Policy Gradient Tianbing Xu BDL 23 0 0 21 Feb 2018
Clipped Action Policy Gradient Yasuhiro Fujita S. Maeda OffRL 34 37 0 21 Feb 2018
Meta-Reinforcement Learning of Structured Exploration Strategies Abhishek Gupta Russell Mendonca YuXuan Liu Pieter Abbeel Sergey Levine OffRL 24 343 0 20 Feb 2018
Layer-wise synapse optimization for implementing neural networks on general neuromorphic architectures John Mern Jayesh K. Gupta Mykel Kochenderfer 40 1 0 20 Feb 2018
Fourier Policy Gradients M. Fellows K. Ciosek Shimon Whiteson 35 15 0 19 Feb 2018
Accelerated Primal-Dual Policy Optimization for Safe Reinforcement Learning Qingkai Liang Fanyu Que E. Modiano 29 101 0 19 Feb 2018
Diversity is All You Need: Learning Skills without a Reward Function Benjamin Eysenbach Abhishek Gupta Julian Ibarz Sergey Levine 42 1,063 0 16 Feb 2018
Reinforcement Learning from Imperfect Demonstrations Yang Gao Huazhe Xu Ji Lin Feng Yu Sergey Levine Trevor Darrell 29 200 0 14 Feb 2018
DiCE: The Infinitely Differentiable Monte-Carlo Estimator Jakob N. Foerster Gregory Farquhar Maruan Al-Shedivat Tim Rocktäschel Eric Xing Shimon Whiteson 13 97 0 14 Feb 2018
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms Cédric Colas Olivier Sigaud Pierre-Yves Oudeyer 29 157 0 14 Feb 2018
Evolved Policy Gradients Rein Houthooft Richard Y. Chen Phillip Isola Bradly C. Stadie Filip Wolski Jonathan Ho Pieter Abbeel 49 227 0 13 Feb 2018
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning Zhang-Wei Hong Tzu-Yun Shann Shih-Yang Su Yi-Hsiang Chang Chun-Yi Lee 10 122 0 13 Feb 2018
Learning Robust and Adaptive Real-World Continuous Control Using Simulation and Transfer Learning M. Ferguson K. Law 13 2 0 13 Feb 2018
Efficient Exploration through Bayesian Deep Q-Networks Kamyar Azizzadenesheli Anima Anandkumar OffRL BDL 26 162 0 13 Feb 2018
Hierarchical Learning for Modular Robots R. Kojcev Nora Etxezarreta Alejandro Hernández Víctor Mayoral 24 4 0 12 Feb 2018
Taking gradients through experiments: LSTMs and memory proximal policy optimization for black-box quantum control Moritz August José Miguel Hernández-Lobato 26 41 0 12 Feb 2018
Sample Efficient Deep Reinforcement Learning for Dialogue Systems with Large Action Spaces Gellert Weisz Paweł Budzianowski Pei-hao Su Milica Gasic 10 82 0 11 Feb 2018
Beyond the One Step Greedy Approach in Reinforcement Learning Yonathan Efroni Gal Dalal B. Scherrer Shie Mannor OffRL 59 48 0 10 Feb 2018
Path Consistency Learning in Tsallis Entropy Regularized MDPs Ofir Nachum Yinlam Chow Mohammad Ghavamzadeh 26 45 0 10 Feb 2018
Balancing Two-Player Stochastic Games with Soft Q-Learning Jordi Grau-Moya Felix Leibfried Haitham Bou-Ammar 24 42 0 09 Feb 2018
Learning and Querying Fast Generative Models for Reinforcement Learning Lars Buesing T. Weber S. Racanière S. M. Ali Eslami Danilo Jimenez Rezende ... Fabio Viola F. Besse Karol Gregor Demis Hassabis Daan Wierstra OffRL 35 134 0 08 Feb 2018
Evaluation of Deep Reinforcement Learning Methods for Modular Robots R. Kojcev Nora Etxezarreta Alejandro Hernández Víctor Mayoral OffRL 23 4 0 07 Feb 2018
VR-Goggles for Robots: Real-to-sim Domain Adaptation for Visual Control Jingwei Zhang L. Tai Peng Yun Yufeng Xiong Ming Liu Joschka Boedecker Wolfram Burgard 21 122 0 01 Feb 2018
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations Xiaoqin Zhang Huimin Ma OffRL 43 38 0 31 Jan 2018