How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies

7 December 2015

Papers citing "How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies"

On shallow planning under partial observability Randy Lefebvre Audrey Durand OffRL 39 0 0 22 Jul 2024
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control Michal Nauman M. Ostaszewski Krzysztof Jankowski Piotr Miłoś Marek Cygan OffRL 45 16 0 25 May 2024
Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning Chengqian Gao William de Vazelhes Hualin Zhang Bin Gu Zhiqiang Xu 54 0 0 02 May 2024
Bigger, Better, Faster: Human-level Atari with human-level efficiency Max Schwarzer J. Obando-Ceron Rameswar Panda Marc G. Bellemare Rishabh Agarwal Pablo Samuel Castro OffRL 54 82 0 30 May 2023
Truncating Trajectories in Monte Carlo Reinforcement Learning Riccardo Poiani Alberto Maria Metelli Marcello Restelli 24 2 0 07 May 2023
Factors of Influence of the Overestimation Bias of Q-Learning Julius Wagenbach M. Sabatelli 15 1 0 11 Oct 2022
Optimizing the Long-Term Behaviour of Deep Reinforcement Learning for Pushing and Grasping Rodrigo Chau 33 0 0 07 Apr 2022
Deep Reinforcement Learning Versus Evolution Strategies: A Comparative Survey Amjad Yousef Majid Serge Saaybi Tomas van Rietbergen Vincent François-Lavet R. V. Prasad Chris Verhoeven OffRL 60 54 0 28 Sep 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods Xin Guo Anran Hu Junzi Zhang OffRL 25 6 0 13 Sep 2021
Taylor Expansion of Discount Factors Yunhao Tang Mark Rowland Rémi Munos Michal Valko OffRL 29 5 0 11 Jun 2021
Automatic Curriculum Learning For Deep RL: A Short Survey Rémy Portelas Cédric Colas Lilian Weng Katja Hofmann Pierre-Yves Oudeyer ODL 19 167 0 10 Mar 2020
Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity Peng Liao Kristjan Greenewald P. Klasnja S. Murphy 17 83 0 08 Sep 2019
Hyperbolic Discounting and Learning over Multiple Horizons W. Fedus Carles Gelada Yoshua Bengio Marc G. Bellemare Hugo Larochelle 29 105 0 19 Feb 2019
Fast Efficient Hyperparameter Tuning for Policy Gradients Supratik Paul Vitaly Kurin Shimon Whiteson 22 32 0 18 Feb 2019
Online Meta-learning by Parallel Algorithm Competition Stefan Elfwing E. Uchibe Kenji Doya 23 22 0 24 Feb 2017