Implicitly Regularized RL with Implicit Q-Values

16 August 2021
Nino Vieillard
Marcin Andrychowicz
Anton Raichuk
Olivier Pietquin
Matthieu Geist
    OffRL
arXiv: 2108.07041

Papers citing "Implicitly Regularized RL with Implicit Q-Values"

25 / 25 papers shown
Reinforcement learning
Florentin Wörgötter
36
2,569
0
16 May 2024
Munchausen Reinforcement Learning
Nino Vieillard
Olivier Pietquin
Matthieu Geist
OffRL
13
90
0
28 Jul 2020
Acme: A Research Framework for Distributed Reinforcement Learning
Matthew W. Hoffman
Bobak Shahriari
John Aslanides
Gabriel Barth-Maron
Nikola Momchev
...
Srivatsan Srinivasan
A. Cowie
Ziyun Wang
Bilal Piot
Nando de Freitas
90
225
0
01 Jun 2020
Leverage the Average: an Analysis of KL Regularization in RL
Nino Vieillard
Tadashi Kozuno
B. Scherrer
Olivier Pietquin
Rémi Munos
Matthieu Geist
29
43
0
31 Mar 2020
Quinoa: a Q-function You Infer Normalized Over Actions
Jonas Degrave
A. Abdolmaleki
Jost Tobias Springenberg
N. Heess
Martin Riedmiller
21
5
0
05 Nov 2019
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Mateusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
53
1,215
0
16 Oct 2019
Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies
Patrick Nadeem Ward
Ariella Smofsky
A. Bose
16
58
0
06 Jun 2019
A Theory of Regularized Markov Decision Processes
Matthieu Geist
B. Scherrer
Olivier Pietquin
82
317
0
31 Jan 2019
Discretizing Continuous Action Space for On-Policy Optimization
Yunhao Tang
Shipra Agrawal
OffRL
32
119
0
29 Jan 2019
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
94
2,391
0
13 Dec 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
59
471
0
14 Jun 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
132
5,121
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
172
8,236
0
04 Jan 2018
Action Branching Architectures for Deep Reinforcement Learning
Arash Tavakoli
Fabio Pardo
Petar Kormushev
34
260
0
24 Nov 2017
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel
Joseph Modayil
H. V. Hasselt
Tom Schaul
Georg Ostrovski
Will Dabney
Dan Horgan
Bilal Piot
M. G. Azar
David Silver
OffRL
85
2,255
0
06 Oct 2017
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
Aravind Rajeswaran
Vikash Kumar
Abhishek Gupta
Giulia Vezzani
John Schulman
E. Todorov
Sergey Levine
85
1,079
0
28 Sep 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
42
106
0
06 Jul 2017
Discrete Sequential Prediction of Continuous Actions for Deep RL
Luke Metz
Julian Ibarz
Navdeep Jaitly
James Davidson
BDL
OffRL
44
117
0
14 May 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
92
469
0
28 Feb 2017
Continuous Deep Q-Learning with Model-based Acceleration
S. Gu
Timothy Lillicrap
Ilya Sutskever
Sergey Levine
53
1,009
0
02 Mar 2016
Increasing the Action Gap: New Operators for Reinforcement Learning
Marc G. Bellemare
Georg Ostrovski
A. Guez
Philip S. Thomas
Rémi Munos
30
156
0
15 Dec 2015
Dueling Network Architectures for Deep Reinforcement Learning
Ziyun Wang
Tom Schaul
Matteo Hessel
H. V. Hasselt
Marc Lanctot
Nando de Freitas
OffRL
56
3,742
0
20 Nov 2015
Variational Inference with Normalizing Flows
Danilo Jimenez Rezende
S. Mohamed
DRL
BDL
226
4,143
0
21 May 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
524
149,474
0
22 Dec 2014
Off-Policy Actor-Critic
T. Degris
Martha White
R. Sutton
OffRL
CML
205
220
0
22 May 2012