Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning

17 October 2020

Papers citing "Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning"

50 / 50 papers shown

Title
Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL Zhikun Tao Gang Xiong He Fang Zhen Shen Yunjun Han Qing-Shan Jia OffRL 105 0 0 13 May 2025
Dynamic Bottleneck for Robust Self-Supervised Exploration Chenjia Bai Lingxiao Wang Lei Han Animesh Garg Jianye Hao Peng Liu Zhaoran Wang 44 29 0 20 Oct 2021
Principled Exploration via Optimistic Bootstrapping and Backward Induction Chenjia Bai Lingxiao Wang Lei Han Jianye Hao Animesh Garg Peng Liu Zhaoran Wang OffRL 46 38 0 13 May 2021
Planning to Explore via Self-Supervised World Models Ramanan Sekar Oleh Rybkin Kostas Daniilidis Pieter Abbeel Danijar Hafner Deepak Pathak SSL 62 406 0 12 May 2020
Agent57: Outperforming the Atari Human Benchmark Adria Puigdomenech Badia Bilal Piot Steven Kapturowski Pablo Sprechmann Alex Vitvitskyi Daniel Guo Charles Blundell OffRL 63 519 0 30 Mar 2020
Optimistic Exploration even with a Pessimistic Initialisation Tabish Rashid Bei Peng Wendelin Böhmer Shimon Whiteson OffRL OnRL 40 44 0 26 Feb 2020
Never Give Up: Learning Directed Exploration Strategies Adria Puigdomenech Badia Pablo Sprechmann Alex Vitvitskyi Daniel Guo Bilal Piot ... O. Tieleman Martín Arjovsky Alexander Pritzel Andrew Bolt Charles Blundell 70 298 0 14 Feb 2020
What Can Learned Intrinsic Rewards Capture? Zeyu Zheng Junhyuk Oh Matteo Hessel Zhongwen Xu M. Kroiss H. V. Hasselt David Silver Satinder Singh 52 77 0 11 Dec 2019
Better Exploration with Optimistic Actor-Critic K. Ciosek Q. Vuong R. Loftin Katja Hofmann 57 154 0 28 Oct 2019
Tutorial and Survey on Probabilistic Graphical Model and Variational Inference in Deep Reinforcement Learning Xudong Sun B. Bischl BDL 47 9 0 25 Aug 2019
MULEX: Disentangling Exploitation from Exploration in Deep RL Lucas Beyer Damien Vincent O. Teboul Sylvain Gelly Matthieu Geist Olivier Pietquin 42 14 0 01 Jul 2019
Exploration via Hindsight Goal Generation Zhizhou Ren Kefan Dong Yuanshuo Zhou Qiang Liu Jian-wei Peng 67 89 0 10 Jun 2019
Self-Supervised Exploration via Disagreement Deepak Pathak Dhiraj Gandhi Abhinav Gupta SSL 73 382 0 10 Jun 2019
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning Rui Zhao Xudong Sun Volker Tresp 50 82 0 21 May 2019
Learning Novel Policies For Tasks Yunbo Zhang Wenhao Yu Greg Turk 41 34 0 13 May 2019
Scheduled Intrinsic Drive: A Hierarchical Take on Intrinsically Motivated Exploration Jingwei Zhang Niklas Wetzel Nicolai Dorka Joschka Boedecker Wolfram Burgard 43 26 0 18 Mar 2019
Model-Based Reinforcement Learning for Atari Lukasz Kaiser Mohammad Babaeizadeh Piotr Milos B. Osinski R. Campbell ... Sergey Levine Afroz Mohiuddin Ryan Sepassi George Tucker Henryk Michalewski OffRL 124 860 0 01 Mar 2019
Contingency-Aware Exploration in Reinforcement Learning Jongwook Choi Yijie Guo Marcin Moczulski Junhyuk Oh Neal Wu Mohammad Norouzi Honglak Lee 54 73 0 05 Nov 2018
VIREL: A Variational Inference Framework for Reinforcement Learning M. Fellows Anuj Mahajan Tim G. J. Rudner Shimon Whiteson DRL 59 56 0 03 Nov 2018
Exploration by Random Network Distillation Yuri Burda Harrison Edwards Amos Storkey Oleg Klimov 157 1,331 0 30 Oct 2018
Model-Based Active Exploration Pranav Shyam Wojciech Jaśkowski Faustino J. Gomez 73 179 0 29 Oct 2018
Episodic Curiosity through Reachability Nikolay Savinov Anton Raichuk Raphaël Marinier Damien Vincent Marc Pollefeys Timothy Lillicrap Sylvain Gelly 54 269 0 04 Oct 2018
Large-Scale Study of Curiosity-Driven Learning Yuri Burda Harrison Edwards Deepak Pathak Amos Storkey Trevor Darrell Alexei A. Efros LRM 69 703 0 13 Aug 2018
Randomized Prior Functions for Deep Reinforcement Learning Ian Osband John Aslanides Albin Cassirer UQCV BDL 66 379 0 08 Jun 2018
Variational Autoencoder with Arbitrary Conditioning Oleg Ivanov Michael Figurnov Dmitry Vetrov BDL DRL 55 147 0 06 Jun 2018
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models Kurtland Chua Roberto Calandra R. McAllister Sergey Levine BDL 221 1,277 0 30 May 2018
A Stochastic Decoder for Neural Machine Translation P. Schulz Wilker Aziz Trevor Cohn BDL 64 29 0 28 May 2018
On Learning Intrinsic Rewards for Policy Gradient Methods Zeyu Zheng Junhyuk Oh Satinder Singh 57 205 0 17 Apr 2018
Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations Dimitri Bertsekas OffRL 57 131 0 12 Apr 2018
DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills Xue Bin Peng Pieter Abbeel Sergey Levine M. van de Panne AI4CE 224 499 0 08 Apr 2018
Learning to Play with Intrinsically-Motivated Self-Aware Agents Nick Haber Damian Mrowca Li Fei-Fei Daniel L. K. Yamins LRM 60 120 0 21 Feb 2018
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning Zhang-Wei Hong Tzu-Yun Shann Shih-Yang Su Yi-Hsiang Chang Chun-Yi Lee 57 124 0 13 Feb 2018
Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation Dane S. Corneil W. Gerstner Johanni Brea OffRL 53 62 0 12 Feb 2018
Progressive Growing of GANs for Improved Quality, Stability, and Variation Tero Karras Timo Aila S. Laine J. Lehtinen GAN 129 7,353 0 27 Oct 2017
Rainbow: Combining Improvements in Deep Reinforcement Learning Matteo Hessel Joseph Modayil H. V. Hasselt Tom Schaul Georg Ostrovski Will Dabney Dan Horgan Bilal Piot M. G. Azar David Silver OffRL 107 2,264 0 06 Oct 2017
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents Marlos C. Machado Marc G. Bellemare Erik Talvitie J. Veness Matthew J. Hausknecht Michael Bowling 71 552 0 18 Sep 2017
A Distributional Perspective on Reinforcement Learning Marc G. Bellemare Will Dabney Rémi Munos OffRL 96 1,504 0 21 Jul 2017
Proximal Policy Optimization Algorithms John Schulman Filip Wolski Prafulla Dhariwal Alec Radford Oleg Klimov OffRL 478 19,019 0 20 Jul 2017
Hindsight Experience Replay Marcin Andrychowicz Dwight Crow Alex Ray Jonas Schneider Rachel Fong Peter Welinder Bob McGrew Joshua Tobin Pieter Abbeel Wojciech Zaremba OffRL 248 2,326 0 05 Jul 2017
Noisy Networks for Exploration Meire Fortunato M. G. Azar Bilal Piot Jacob Menick Ian Osband ... Rémi Munos Demis Hassabis Olivier Pietquin Charles Blundell Shane Legg 79 895 0 30 Jun 2017
Parameter Space Noise for Exploration Matthias Plappert Rein Houthooft Prafulla Dhariwal Szymon Sidor Richard Y. Chen Xi Chen Tamim Asfour Pieter Abbeel Marcin Andrychowicz 54 596 0 06 Jun 2017
Curiosity-driven Exploration by Self-supervised Prediction Deepak Pathak Pulkit Agrawal Alexei A. Efros Trevor Darrell LRM SSL 106 2,436 0 15 May 2017
Count-Based Exploration with Neural Density Models Georg Ostrovski Marc G. Bellemare Aaron van den Oord Rémi Munos 84 622 0 03 Mar 2017
Unifying Count-Based Exploration and Intrinsic Motivation Marc G. Bellemare S. Srinivasan Georg Ostrovski Tom Schaul D. Saxton Rémi Munos 167 1,478 0 06 Jun 2016
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection Sergey Levine P. Pastor A. Krizhevsky Deirdre Quillen 169 2,072 0 07 Mar 2016
Deep Exploration via Bootstrapped DQN Ian Osband Charles Blundell Alexander Pritzel Benjamin Van Roy 121 1,308 0 15 Feb 2016
Variational Inference: A Review for Statisticians David M. Blei A. Kucukelbir Jon D. McAuliffe BDL 264 4,787 0 04 Jan 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation John Schulman Philipp Moritz Sergey Levine Michael I. Jordan Pieter Abbeel OffRL 90 3,406 0 08 Jun 2015
Trust Region Policy Optimization John Schulman Sergey Levine Philipp Moritz Michael I. Jordan Pieter Abbeel 277 6,767 0 19 Feb 2015
Auto-Encoding Variational Bayes Diederik P. Kingma Max Welling BDL 450 16,940 0 20 Dec 2013