Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018

Pieter Abbeel

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown

Title | Authors | Topic tags | Counts | Date
Quinoa: a Q-function You Infer Normalized Over Actions | Jonas Degrave, A. Abdolmaleki, Jost Tobias Springenberg, N. Heess, Martin Riedmiller | | 42 5 0 | 05 Nov 2019
DeepRacer: Educational Autonomous Racing Platform for Experimentation with Sim2Real Reinforcement Learning | Bharathan Balaji, S. Mallya, Sahika Genc, Saurabh Gupta, Leo Dirac, ..., Yunzhe Tao, Brian Townsend, E. Calleja, Sunil Muralidhara, Dhanasekar Karuppasamy | | 80 57 0 | 05 Nov 2019
Maximum Entropy Diverse Exploration: Disentangling Maximum Entropy Reinforcement Learning | Andrew Cohen, Lei Yu, Xingye Qiao, Xiangrong Tong | | 62 2 0 | 03 Nov 2019
Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation | Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim | | 108 218 0 | 30 Oct 2019
Learning to Manipulate Deformable Objects without Demonstrations | Yilin Wu, Wilson Yan, Thanard Kurutach, Lerrel Pinto, Pieter Abbeel | OffRL | 78 202 0 | 29 Oct 2019
Certified Adversarial Robustness for Deep Reinforcement Learning | Björn Lütjens, Michael Everett, Jonathan P. How | AAML | 111 96 0 | 28 Oct 2019
Better Exploration with Optimistic Actor-Critic | K. Ciosek, Q. Vuong, R. Loftin, Katja Hofmann | | 77 156 0 | 28 Oct 2019
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning | Xinyue Chen, Zijian Zhou, Ziyi Wang, Che Wang, Yanqiu Wu, George Andriopoulos | OffRL | 125 125 0 | 27 Oct 2019
Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning | Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, Sergey Levine | OffRL | 340 1,182 0 | 24 Oct 2019
Teach Biped Robots to Walk via Gait Principles and Reinforcement Learning with Adversarial Critics | Kuangen Zhang, Zhimin Hou, Clarence W. de Silva, Haoyong Yu, Chenglong Fu | | 41 8 0 | 22 Oct 2019
Learning Resilient Behaviors for Navigation Under Uncertainty | Tingxiang Fan, Pinxin Long, Wenxi Liu, Jia Pan, Ruigang Yang, Tianyi Zhou | | 92 20 0 | 22 Oct 2019
Momentum in Reinforcement Learning | Nino Vieillard, B. Scherrer, Olivier Pietquin, Matthieu Geist | | 90 35 0 | 21 Oct 2019
Regularization Matters in Policy Optimization | Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell | OffRL | 83 33 0 | 21 Oct 2019
A New Framework for Multi-Agent Reinforcement Learning -- Centralized Training and Exploration with Decentralized Execution via Policy Distillation | Gang Chen | | 64 41 0 | 21 Oct 2019
OffWorld Gym: open-access physical robotics environment for real-world reinforcement learning benchmark and research | Ashish Kumar, Toby Buckley, John B. Lanier, Qiaozhi Wang, A. Kavelaars, Ilya Kuzovkin | OffRL | 101 14 0 | 18 Oct 2019
Reinforcement Learning for Robotic Manipulation using Simulated Locomotion Demonstrations | Ozsel Kilinc, Giovanni Montana | | 54 39 0 | 16 Oct 2019
Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments | Rémy Portelas, Cédric Colas, Katja Hofmann, Pierre-Yves Oudeyer | | 68 146 0 | 16 Oct 2019
Soft Actor-Critic for Discrete Action Settings | Petros Christodoulou | OffRL | 168 305 0 | 16 Oct 2019
Regularizing Model-Based Planning with Energy-Based Models | Rinu Boney, Arno Solin, Alexander Ilin | | 81 18 0 | 12 Oct 2019
Integrating Behavior Cloning and Reinforcement Learning for Improved Performance in Dense and Sparse Reward Environments | Vinicius G. Goecks, Gregory M. Gremillion, Vernon J. Lawhern, J. Valasek, Nicholas R. Waytowich | OffRL | 112 31 0 | 09 Oct 2019
Ctrl-Z: Recovering from Instability in Reinforcement Learning | Vibhavari Dasagi, Jake Bruce, T. Peynot, Jurgen Leitner | | 58 10 0 | 09 Oct 2019
Receding Horizon Curiosity | M. Schultheis, Boris Belousov, Hany Abdulsamad, Jan Peters | | 75 15 0 | 08 Oct 2019
Attention-based Fault-tolerant Approach for Multi-agent Reinforcement Learning Systems | Mingyang Geng, Kele Xu, Yiying Li, Shuqi Liu, Bo Ding, Huaimin Wang | AAML | 419 10 0 | 05 Oct 2019
Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling | Che Wang, Yanqiu Wu, Q. Vuong, George Andriopoulos | | 36 6 0 | 05 Oct 2019
If MaxEnt RL is the Answer, What is the Question? | Benjamin Eysenbach, Sergey Levine | | 77 59 0 | 04 Oct 2019
Never Worse, Mostly Better: Stable Policy Improvement in Deep Reinforcement Learning | P. Khanna, Guy Tennenholtz, Nadav Merlis, Shie Mannor, Chen Tessler | OffRL | 26 1 0 | 02 Oct 2019
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | Xue Bin Peng, Aviral Kumar, Grace Zhang, Sergey Levine | OffRL | 176 571 0 | 01 Oct 2019
RLCache: Automated Cache Management Using Reinforcement Learning | Sami Alabed | | 45 4 0 | 30 Sep 2019
MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics | M. Barekatain, Ryo Yonetani, Masashi Hamaya | | 95 26 0 | 28 Sep 2019
Multi-Agent Actor-Critic with Hierarchical Graph Attention Network | Heechang Ryu, Hayong Shin, Jinkyoo Park | | 60 118 0 | 27 Sep 2019
CAQL: Continuous Action Q-Learning | Moonkyung Ryu, Yinlam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier | | 289 43 0 | 26 Sep 2019
RLBench: The Robot Learning Benchmark & Learning Environment | Stephen James, Z. Ma, David Rovick Arrojo, Andrew J. Davison | SSL, VLM, OffRL | 147 563 0 | 26 Sep 2019
V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control | H. F. Song, A. Abdolmaleki, Jost Tobias Springenberg, Aidan Clark, Hubert Soyer, ..., Dhruva Tirumala, N. Heess, Dan Belov, Martin Riedmiller, M. Botvinick | | 121 126 0 | 26 Sep 2019
Model Imitation for Model-Based Reinforcement Learning | Yueh-hua Wu, Ting-Han Fan, Peter J. Ramadge, H. Su | OffRL | 52 16 0 | 25 Sep 2019
Deep Dynamics Models for Learning Dexterous Manipulation | Anusha Nagabandi, K. Konolige, Sergey Levine, Vikash Kumar | | 246 417 0 | 25 Sep 2019
Off-Policy Actor-Critic with Shared Experience Replay | Simon Schmitt, Matteo Hessel, Karen Simonyan | OffRL | 80 68 0 | 25 Sep 2019
Multi-task Batch Reinforcement Learning with Metric Learning | Jiachen Li, Q. Vuong, Shuang Liu, Minghua Liu, K. Ciosek, George Andriopoulos, Henrik I. Christensen, H. Su | OffRL | 65 2 0 | 25 Sep 2019
On the Convergence of Approximate and Regularized Policy Iteration Schemes | E. Smirnova, Elvis Dohmatob | | 45 5 0 | 20 Sep 2019
A Hierarchical Two-tier Approach to Hyper-parameter Optimization in Reinforcement Learning | Juan Cruz Barsce, J. Palombarini, E. Martínez | OffRL | 30 0 0 | 18 Sep 2019
Learning to Manipulate Object Collections Using Grounded State Representations | Matthew Wilson, Tucker Hermans | SSL | 104 27 0 | 17 Sep 2019
Visualizing Movement Control Optimization Landscapes | Perttu Hämäläinen, Juuso Toikka, Amin Babadi, Karen Liu | | 59 7 0 | 17 Sep 2019
MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning | Raghunandan Rajan, Jessica Lizeth Borja Diaz, Suresh Guttikonda, Fabio Ferreira, André Biedenkapp, Jan Ole von Hartz, Frank Hutter | | 145 4 0 | 17 Sep 2019
Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning | T. Doan, Bogdan Mazoure, Moloud Abdar, A. Durand, Joelle Pineau, R. Devon Hjelm | | 73 15 0 | 17 Sep 2019
Model Based Planning with Energy Based Models | Yilun Du, Toru Lin, Igor Mordatch | | 97 38 0 | 15 Sep 2019
State Representation Learning from Demonstration | Astrid Merckling, Michael Pearce, Loic Cressot, Stéphane Doncieux, Matthias Poloczek | OffRL | 78 8 0 | 15 Sep 2019
VILD: Variational Imitation Learning with Diverse-quality Demonstrations | Voot Tangkaratt, Bo Han, Mohammad Emtiyaz Khan, Masashi Sugiyama | | 78 20 0 | 15 Sep 2019
ISL: A novel approach for deep exploration | Lucas Cassano, Ali H. Sayed | | 62 1 0 | 13 Sep 2019
Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning | Felix Leibfried, Jordi Grau-Moya | | 79 22 0 | 11 Sep 2019
AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers | Andrey Kurenkov, Ajay Mandlekar, R. M. Martin, Silvio Savarese, Animesh Garg | | 75 48 0 | 09 Sep 2019
Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses | Matt Grenander, Yue Dong, Jackie C.K. Cheung, Annie Louis | | 79 36 0 | 08 Sep 2019