Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach

Versions: v1, v2 (latest)


21 April 2022

Bobak Shahriari

Arunkumar Byravan

Jost Tobias Springenberg

Matthew W. Hoffman

Martin Riedmiller


Papers citing "Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach"


Title | Authors | Tag | Counts | Date
Generalized Gaussian Temporal Difference Error for Uncertainty-aware Reinforcement Learning | Seyeon Kim, Joonhun Lee, Namhoon Cho, Sungjun Han, Seungeon Baek | | 111 0 0 | 05 Aug 2024
GMAC: A Distributional Perspective on Actor-Critic Framework | D. W. Nam, Younghoon Kim, Chan Y. Park | | 51 17 0 | 24 May 2021
Acme: A Research Framework for Distributed Reinforcement Learning | Matthew W. Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, ..., Srivatsan Srinivasan, A. Cowie, Ziyun Wang, Bilal Piot, Nando de Freitas | | 126 226 0 | 01 Jun 2020
Relative Entropy Regularized Policy Iteration | A. Abdolmaleki, Jost Tobias Springenberg, Jonas Degrave, Steven Bohez, Yuval Tassa, Dan Belov, N. Heess, Martin Riedmiller | | 68 72 0 | 05 Dec 2018
Implicit Quantile Networks for Distributional Reinforcement Learning | Will Dabney, Georg Ostrovski, David Silver, Rémi Munos | OffRL | 139 532 0 | 14 Jun 2018
Maximum a Posteriori Policy Optimisation | A. Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Rémi Munos, N. Heess, Martin Riedmiller | | 81 478 0 | 14 Jun 2018
Distributed Distributional Deterministic Policy Gradients | Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, TB Dhruva, Alistair Muldal, N. Heess, Timothy Lillicrap | OffRL | 98 480 0 | 23 Apr 2018
An Analysis of Categorical Distributional Reinforcement Learning | Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh | | 70 102 0 | 22 Feb 2018
Distributional Reinforcement Learning with Quantile Regression | Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos | | 95 764 0 | 27 Oct 2017
A Distributional Perspective on Reinforcement Learning | Marc G. Bellemare, Will Dabney, Rémi Munos | OffRL | 103 1,506 0 | 21 Jul 2017
The Cramer Distance as a Solution to Biased Wasserstein Gradients | Marc G. Bellemare, Ivo Danihelka, Will Dabney, S. Mohamed, Balaji Lakshminarayanan, Stephan Hoyer, Rémi Munos | GAN | 87 344 0 | 30 May 2017
Learning values across many orders of magnitude | H. V. Hasselt, A. Guez, Matteo Hessel, Volodymyr Mnih, David Silver | | 83 170 0 | 24 Feb 2016
Parametric Return Density Estimation for Reinforcement Learning | Tetsuro Morimura, Masashi Sugiyama, H. Kashima, Hirotaka Hachiya, Toshiyuki Tanaka | | 91 112 0 | 15 Mar 2012
