Maxmin Q-learning: Controlling the Estimation Bias of Q-learning

16 February 2020
Qingfeng Lan
Yangchen Pan
Alona Fyshe
Martha White

Papers citing "Maxmin Q-learning: Controlling the Estimation Bias of Q-learning"

49 / 99 papers shown
Anti-Overestimation Dialogue Policy Learning for Task-Completion Dialogue System
T. Chang
Wenpeng Yin
Marie-Francine Moens
OffRL
33
4
0
24 Jul 2022
Sampling Efficient Deep Reinforcement Learning through Preference-Guided Stochastic Exploration
Wenhui Huang
Cong Zhang
Jingda Wu
Xiangkun He
Jie Zhang
Chengqi Lv
16
8
0
20 Jun 2022
Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation
Qingfeng Lan
Yangchen Pan
Jun Luo
A. R. Mahmood
OffRL
36
8
0
22 May 2022
$q$-Munchausen Reinforcement Learning
Lingwei Zhu
Zheng Chen
E. Uchibe
Takamitsu Matsubara
OffRL
16
0
0
16 May 2022
Action Candidate Driven Clipped Double Q-learning for Discrete and Continuous Action Tasks
Haobo Jiang
Jin Xie
Jian Yang
OffRL
11
10
0
22 Mar 2022
Smoothing Advantage Learning
Yaozhong Gan
Zhe Zhang
Xiaoyang Tan
AAML
20
2
0
20 Mar 2022
VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning
Che Wang
Xufang Luo
Keith Ross
Dongsheng Li
OffRL
30
49
0
17 Feb 2022
Regularized Q-learning
Han-Dong Lim
Donghwan Lee
29
10
0
11 Feb 2022
Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning
Michael Teng
M. van de Panne
Frank Wood
OOD
OffRL
14
1
0
06 Feb 2022
DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement Learning
Hassam Sheikh
Kizza M Nandyose Frisbee
Mariano Phielipp
25
8
0
31 Jan 2022
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning
Nicolai Dorka
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
OffRL
32
9
0
24 Nov 2021
Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance
Yanqiu Wu
Xinyue Chen
Che Wang
Yiming Zhang
Keith Ross
OffRL
17
9
0
17 Nov 2021
AWD3: Dynamic Reduction of the Estimation Bias
Dogan C. Cicek
Enes Duran
Baturay Saglam
Kagan Kaya
Furkan B. Mutlu
Suleyman Serdar Kozat
OffRL
11
7
0
12 Nov 2021
Balanced Q-learning: Combining the Influence of Optimistic and Pessimistic Targets
Thommen George Karimpanal
Hung Le
Majid Abdolshah
Santu Rana
Sunil R. Gupta
T. Tran
Svetha Venkatesh
19
5
0
03 Nov 2021
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates
Litian Liang
Yaosheng Xu
Stephen Marcus McAleer
Dailin Hu
Alexander Ihler
Pieter Abbeel
Roy Fox
15
4
0
28 Oct 2021
Automating Control of Overestimation Bias for Reinforcement Learning
Arsenii Kuznetsov
Alexander Grishin
Artem Tsypin
Arsenii Ashukha
Artur Kadurin
Dmitry Vetrov
OffRL
6
2
0
26 Oct 2021
Balancing Value Underestimation and Overestimation with Realistic Actor-Critic
Sicen Li
Qinyun Tang
G. Wang
Xinmeng Ma
Li-quan Wang
OffRL
17
4
0
19 Oct 2021
A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets
J. E. Grigsby
Yanjun Qi
OffRL
34
5
0
10 Oct 2021
Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning
Edoardo Cetin
Oya Celiktutan
OffRL
47
17
0
07 Oct 2021
Dropout Q-Functions for Doubly Efficient Reinforcement Learning
Takuya Hiraoka
Takahisa Imagawa
Taisei Hashimoto
Takashi Onishi
Yoshimasa Tsuruoka
13
105
0
05 Oct 2021
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
Gaon An
Seungyong Moon
Jang-Hyun Kim
Hyun Oh Song
OffRL
105
265
0
04 Oct 2021
On the Estimation Bias in Double Q-Learning
Zhizhou Ren
Guangxiang Zhu
Haotian Hu
Beining Han
Jian-Hai Chen
Chongjie Zhang
24
17
0
29 Sep 2021
Parameter-free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients
Baturay Saglam
Furkan B. Mutlu
Dogan C. Cicek
Suleyman Serdar Kozat
OffRL
14
3
0
24 Sep 2021
Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods
Baturay Saglam
Enes Duran
Dogan C. Cicek
Furkan B. Mutlu
Suleyman Serdar Kozat
OffRL
50
12
0
22 Sep 2021
MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning
Qiang He
Yuxun Qu
Chen Gong
Xinwen Hou
OffRL
22
10
0
22 Sep 2021
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
Seunghyun Lee
Younggyo Seo
Kimin Lee
Pieter Abbeel
Jinwoo Shin
OffRL
OnRL
22
182
0
01 Jul 2021
Efficient Continuous Control with Double Actors and Regularized Critics
Jiafei Lyu
Xiaoteng Ma
Jiangpeng Yan
Xiu Li
OffRL
19
48
0
06 Jun 2021
In Defense of the Paper
Owen Lockwood
19
0
0
16 Apr 2021
Regularized Softmax Deep Multi-Agent $Q$-Learning
L. Pan
Tabish Rashid
Bei Peng
Longbo Huang
Shimon Whiteson
42
31
0
22 Mar 2021
Generalizable Episodic Memory for Deep Reinforcement Learning
Haotian Hu
Jianing Ye
Guangxiang Zhu
Zhizhou Ren
Chongjie Zhang
OffRL
33
39
0
11 Mar 2021
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction
Hongyao Tang
Jianye Hao
Guangyong Chen
Pengfei Chen
Chong Chen
Yaodong Yang
Lu Zhang
Wulong Liu
Zhaopeng Meng
OffRL
35
4
0
03 Mar 2021
Ensemble Bootstrapping for Q-Learning
Oren Peer
Chen Tessler
Nadav Merlis
Ron Meir
19
42
0
28 Feb 2021
Greedy-Step Off-Policy Reinforcement Learning
Yuhui Wang
Qingyuan Wu
Pengcheng He
Xiaoyang Tan
OffRL
26
1
0
23 Feb 2021
Continuous Doubly Constrained Batch Reinforcement Learning
Rasool Fakoor
Jonas W. Mueller
Kavosh Asadi
Pratik Chaudhari
Alex Smola
OffRL
204
27
0
18 Feb 2021
RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents
Wei Qiu
Xinrun Wang
Runsheng Yu
Xu He
R. Wang
Bo An
S. Obraztsova
Zinovi Rabinovich
35
50
0
16 Feb 2021
Model-Augmented Q-learning
Youngmin Oh
Jinwoo Shin
Eunho Yang
Sung Ju Hwang
OffRL
24
1
0
07 Feb 2021
What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator
Hongyao Tang
Zhaopeng Meng
Jianye Hao
Chong Chen
D. Graves
...
Hangyu Mao
Wulong Liu
Yaodong Yang
Wenyuan Tao
Li Wang
OffRL
24
7
0
19 Oct 2020
Softmax Deep Double Deterministic Policy Gradients
Ling Pan
Qingpeng Cai
Longbo Huang
72
86
0
19 Oct 2020
Learning Intrinsic Symbolic Rewards in Reinforcement Learning
Hassam Sheikh
Shauharda Khadka
Santiago Miret
Somdeb Majumdar
OffRL
29
7
0
08 Oct 2020
Energy-based Surprise Minimization for Multi-Agent Value Factorization
Karush Suri
Xiaolong Shi
Konstantinos Plataniotis
Y. Lawryshyn
21
1
0
16 Sep 2020
Maximum Mutation Reinforcement Learning for Scalable Control
Karush Suri
Xiaolong Shi
Konstantinos N. Plataniotis
Y. Lawryshyn
25
4
0
24 Jul 2020
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee
Michael Laskin
A. Srinivas
Pieter Abbeel
OffRL
25
199
0
09 Jul 2020
Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity
Hassam Sheikh
Ladislau Bölöni
11
0
0
24 Jun 2020
WD3: Taming the Estimation Bias in Deep Reinforcement Learning
Qiang He
Xinwen Hou
OffRL
10
28
0
18 Jun 2020
Self-Imitation Learning via Generalized Lower Bound Q-learning
Yunhao Tang
SSL
33
24
0
12 Jun 2020
Decorrelated Double Q-learning
Gang Chen
11
2
0
12 Jun 2020
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
Arsenii Kuznetsov
Pavel Shvechikov
Alexander Grishin
Dmitry Vetrov
136
188
0
08 May 2020
Deep Reinforcement Learning with Weighted Q-Learning
Andrea Cini
Carlo D'Eramo
Jan Peters
Cesare Alippi
OffRL
26
9
0
20 Mar 2020
Context-Dependent Upper-Confidence Bounds for Directed Exploration
Raksha Kumaraswamy
M. Schlegel
Adam White
Martha White
OffRL
20
12
0
15 Nov 2018