2002.06487
Cited By
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning
16 February 2020
Qingfeng Lan
Yangchen Pan
Alona Fyshe
Martha White
Papers citing "Maxmin Q-learning: Controlling the Estimation Bias of Q-learning"
49 / 99 papers shown
Title
Anti-Overestimation Dialogue Policy Learning for Task-Completion Dialogue System
T. Chang
Wenpeng Yin
Marie-Francine Moens
OffRL
33
4
0
24 Jul 2022
Sampling Efficient Deep Reinforcement Learning through Preference-Guided Stochastic Exploration
Wenhui Huang
Cong Zhang
Jingda Wu
Xiangkun He
Jie Zhang
Chengqi Lv
16
8
0
20 Jun 2022
Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation
Qingfeng Lan
Yangchen Pan
Jun Luo
A. R. Mahmood
OffRL
36
8
0
22 May 2022
q-Munchausen Reinforcement Learning
Lingwei Zhu
Zheng Chen
E. Uchibe
Takamitsu Matsubara
OffRL
16
0
0
16 May 2022
Action Candidate Driven Clipped Double Q-learning for Discrete and Continuous Action Tasks
Haobo Jiang
Jin Xie
Jian Yang
OffRL
11
10
0
22 Mar 2022
Smoothing Advantage Learning
Yaozhong Gan
Zhe Zhang
Xiaoyang Tan
AAML
20
2
0
20 Mar 2022
VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning
Che Wang
Xufang Luo
Keith Ross
Dongsheng Li
OffRL
30
49
0
17 Feb 2022
Regularized Q-learning
Han-Dong Lim
Donghwan Lee
29
10
0
11 Feb 2022
Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning
Michael Teng
M. van de Panne
Frank Wood
OOD
OffRL
14
1
0
06 Feb 2022
DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement Learning
Hassam Sheikh
Kizza M Nandyose Frisbee
Mariano Phielipp
25
8
0
31 Jan 2022
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning
Nicolai Dorka
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
OffRL
32
9
0
24 Nov 2021
Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance
Yanqiu Wu
Xinyue Chen
Che Wang
Yiming Zhang
Keith Ross
OffRL
17
9
0
17 Nov 2021
AWD3: Dynamic Reduction of the Estimation Bias
Dogan C. Cicek
Enes Duran
Baturay Saglam
Kagan Kaya
Furkan B. Mutlu
Suleyman Serdar Kozat
OffRL
11
7
0
12 Nov 2021
Balanced Q-learning: Combining the Influence of Optimistic and Pessimistic Targets
Thommen George Karimpanal
Hung Le
Majid Abdolshah
Santu Rana
Sunil R. Gupta
T. Tran
Svetha Venkatesh
19
5
0
03 Nov 2021
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates
Litian Liang
Yaosheng Xu
Stephen Marcus McAleer
Dailin Hu
Alexander Ihler
Pieter Abbeel
Roy Fox
15
4
0
28 Oct 2021
Automating Control of Overestimation Bias for Reinforcement Learning
Arsenii Kuznetsov
Alexander Grishin
Artem Tsypin
Arsenii Ashukha
Artur Kadurin
Dmitry Vetrov
OffRL
6
2
0
26 Oct 2021
Balancing Value Underestimation and Overestimation with Realistic Actor-Critic
Sicen Li
Qinyun Tang
G. Wang
Xinmeng Ma
Li-quan Wang
OffRL
17
4
0
19 Oct 2021
A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets
J. E. Grigsby
Yanjun Qi
OffRL
34
5
0
10 Oct 2021
Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning
Edoardo Cetin
Oya Celiktutan
OffRL
47
17
0
07 Oct 2021
Dropout Q-Functions for Doubly Efficient Reinforcement Learning
Takuya Hiraoka
Takahisa Imagawa
Taisei Hashimoto
Takashi Onishi
Yoshimasa Tsuruoka
13
105
0
05 Oct 2021
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
Gaon An
Seungyong Moon
Jang-Hyun Kim
Hyun Oh Song
OffRL
105
265
0
04 Oct 2021
On the Estimation Bias in Double Q-Learning
Zhizhou Ren
Guangxiang Zhu
Haotian Hu
Beining Han
Jian-Hai Chen
Chongjie Zhang
24
17
0
29 Sep 2021
Parameter-free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients
Baturay Saglam
Furkan B. Mutlu
Dogan C. Cicek
Suleyman Serdar Kozat
OffRL
14
3
0
24 Sep 2021
Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods
Baturay Saglam
Enes Duran
Dogan C. Cicek
Furkan B. Mutlu
Suleyman Serdar Kozat
OffRL
50
12
0
22 Sep 2021
MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning
Qiang He
Yuxun Qu
Chen Gong
Xinwen Hou
OffRL
22
10
0
22 Sep 2021
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
Seunghyun Lee
Younggyo Seo
Kimin Lee
Pieter Abbeel
Jinwoo Shin
OffRL
OnRL
22
182
0
01 Jul 2021
Efficient Continuous Control with Double Actors and Regularized Critics
Jiafei Lyu
Xiaoteng Ma
Jiangpeng Yan
Xiu Li
OffRL
19
48
0
06 Jun 2021
In Defense of the Paper
Owen Lockwood
19
0
0
16 Apr 2021
Regularized Softmax Deep Multi-Agent Q-Learning
L. Pan
Tabish Rashid
Bei Peng
Longbo Huang
Shimon Whiteson
42
31
0
22 Mar 2021
Generalizable Episodic Memory for Deep Reinforcement Learning
Haotian Hu
Jianing Ye
Guangxiang Zhu
Zhizhou Ren
Chongjie Zhang
OffRL
33
39
0
11 Mar 2021
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction
Hongyao Tang
Jianye Hao
Guangyong Chen
Pengfei Chen
Chong Chen
Yaodong Yang
Lu Zhang
Wulong Liu
Zhaopeng Meng
OffRL
35
4
0
03 Mar 2021
Ensemble Bootstrapping for Q-Learning
Oren Peer
Chen Tessler
Nadav Merlis
Ron Meir
19
42
0
28 Feb 2021
Greedy-Step Off-Policy Reinforcement Learning
Yuhui Wang
Qingyuan Wu
Pengcheng He
Xiaoyang Tan
OffRL
26
1
0
23 Feb 2021
Continuous Doubly Constrained Batch Reinforcement Learning
Rasool Fakoor
Jonas W. Mueller
Kavosh Asadi
Pratik Chaudhari
Alex Smola
OffRL
204
27
0
18 Feb 2021
RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents
Wei Qiu
Xinrun Wang
Runsheng Yu
Xu He
R. Wang
Bo An
S. Obraztsova
Zinovi Rabinovich
35
50
0
16 Feb 2021
Model-Augmented Q-learning
Youngmin Oh
Jinwoo Shin
Eunho Yang
Sung Ju Hwang
OffRL
24
1
0
07 Feb 2021
What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator
Hongyao Tang
Zhaopeng Meng
Jianye Hao
Chong Chen
D. Graves
...
Hangyu Mao
Wulong Liu
Yaodong Yang
Wenyuan Tao
Li Wang
OffRL
24
7
0
19 Oct 2020
Softmax Deep Double Deterministic Policy Gradients
Ling Pan
Qingpeng Cai
Longbo Huang
72
86
0
19 Oct 2020
Learning Intrinsic Symbolic Rewards in Reinforcement Learning
Hassam Sheikh
Shauharda Khadka
Santiago Miret
Somdeb Majumdar
OffRL
29
7
0
08 Oct 2020
Energy-based Surprise Minimization for Multi-Agent Value Factorization
Karush Suri
Xiaolong Shi
Konstantinos Plataniotis
Y. Lawryshyn
21
1
0
16 Sep 2020
Maximum Mutation Reinforcement Learning for Scalable Control
Karush Suri
Xiaolong Shi
Konstantinos N. Plataniotis
Y. Lawryshyn
25
4
0
24 Jul 2020
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee
Michael Laskin
A. Srinivas
Pieter Abbeel
OffRL
25
199
0
09 Jul 2020
Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity
Hassam Sheikh
Ladislau Bölöni
11
0
0
24 Jun 2020
WD3: Taming the Estimation Bias in Deep Reinforcement Learning
Qiang He
Xinwen Hou
OffRL
10
28
0
18 Jun 2020
Self-Imitation Learning via Generalized Lower Bound Q-learning
Yunhao Tang
SSL
33
24
0
12 Jun 2020
Decorrelated Double Q-learning
Gang Chen
11
2
0
12 Jun 2020
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
Arsenii Kuznetsov
Pavel Shvechikov
Alexander Grishin
Dmitry Vetrov
136
188
0
08 May 2020
Deep Reinforcement Learning with Weighted Q-Learning
Andrea Cini
Carlo D'Eramo
Jan Peters
Cesare Alippi
OffRL
26
9
0
20 Mar 2020
Context-Dependent Upper-Confidence Bounds for Directed Exploration
Raksha Kumaraswamy
M. Schlegel
Adam White
Martha White
OffRL
20
12
0
15 Nov 2018