Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning

16 May 2022

Zheng Chen

Takamitsu Matsubara

Papers citing "Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning"

Self-Imitation Advantage Learning Johan Ferret Olivier Pietquin Matthieu Geist 157 20 0 22 Dec 2020
Munchausen Reinforcement Learning Nino Vieillard Olivier Pietquin Matthieu Geist OffRL 44 90 0 28 Jul 2020
Leverage the Average: an Analysis of KL Regularization in RL Nino Vieillard Tadashi Kozuno B. Scherrer Olivier Pietquin Rémi Munos Matthieu Geist 55 43 0 31 Mar 2020
Diagnosing Bottlenecks in Deep Q-learning Algorithms Justin Fu Aviral Kumar Matthew Soh Sergey Levine OffRL 79 142 0 26 Feb 2019
A Theory of Regularized Markov Decision Processes Matthieu Geist B. Scherrer Olivier Pietquin 128 331 0 31 Jan 2019
Understanding the impact of entropy on policy optimization Zafarali Ahmed Nicolas Le Roux Mohammad Norouzi Dale Schuurmans 73 233 0 27 Nov 2018
Effective Exploration for Deep Reinforcement Learning via Bootstrapped Q-Ensembles under Tsallis Entropy Regularization Gang Chen Yiming Peng Mengjie Zhang 29 14 0 02 Sep 2018
Learning Dexterous In-Hand Manipulation OpenAI Marcin Andrychowicz Bowen Baker Maciek Chociej Rafal Jozefowicz ... Szymon Sidor Joshua Tobin Peter Welinder Lilian Weng Wojciech Zaremba 153 1,880 0 01 Aug 2018
Addressing Function Approximation Error in Actor-Critic Methods Scott Fujimoto H. V. Hoof David Meger OffRL 180 5,187 0 26 Feb 2018
Path Consistency Learning in Tsallis Entropy Regularized MDPs Ofir Nachum Yinlam Chow Mohammad Ghavamzadeh 64 45 0 10 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor Tuomas Haarnoja Aurick Zhou Pieter Abbeel Sergey Levine 311 8,352 0 04 Jan 2018
Distributional Reinforcement Learning with Quantile Regression Will Dabney Mark Rowland Marc G. Bellemare Rémi Munos 92 760 0 27 Oct 2017
Sparse Markov Decision Processes with Causal Sparse Tsallis Entropy Regularization for Reinforcement Learning Kyungjae Lee Sungjoon Choi Songhwai Oh 53 68 0 19 Sep 2017
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification André F. T. Martins Ramón Fernández Astudillo 179 726 0 05 Feb 2016
Increasing the Action Gap: New Operators for Reinforcement Learning Marc G. Bellemare Georg Ostrovski A. Guez Philip S. Thomas Rémi Munos 74 157 0 15 Dec 2015
The Arcade Learning Environment: An Evaluation Platform for General Agents Marc G. Bellemare Yavar Naddaf J. Veness Michael Bowling 120 3,006 0 19 Jul 2012
Dynamic Policy Programming M. G. Azar Vicenç Gómez H. Kappen 111 123 0 12 Apr 2010