Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning

7 November 2016
Oron Anschel
Nir Baram
N. Shimkin

Papers citing "Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning"

44 / 44 papers shown
IL-SOAR : Imitation Learning with Soft Optimistic Actor cRitic
Stefano Viel
Luca Viano
V. Cevher
92
0
0
27 Feb 2025
A Multi-Agent Multi-Environment Mixed Q-Learning for Partially Decentralized Wireless Network Optimization
Talha Bozkus
Urbashi Mitra
50
1
0
31 Dec 2024
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
C. Voelcker
Marcel Hussing
Eric Eaton
Amir-massoud Farahmand
Igor Gilitschenski
46
2
0
11 Oct 2024
Dissecting Deep RL with High Update Ratios: Combatting Value Divergence
Marcel Hussing
C. Voelcker
Igor Gilitschenski
Amir-massoud Farahmand
Eric Eaton
47
3
0
09 Mar 2024
Self-evolving Autoencoder Embedded Q-Network
J. Senthilnath
Bangjian Zhou
Zhen Wei Ng
Deeksha Aggarwal
Rajdeep Dutta
Ji Wei Yoon
Phyu Aung
Keyu Wu
Xiaoli Li
64
1
0
18 Feb 2024
Dataset Clustering for Improved Offline Policy Learning
Qiang Wang
Yixin Deng
Francisco Roldan Sanchez
Keru Wang
Kevin McGuinness
Noel E. O'Connor
Stephen J. Redmond
OffRL
34
2
0
14 Feb 2024
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
26
0
0
24 Dec 2023
One is More: Diverse Perspectives within a Single Network for Efficient DRL
Yiqin Tan
Ling Pan
Longbo Huang
OffRL
43
0
0
21 Oct 2023
PASTA: Pretrained Action-State Transformer Agents
Raphael Boige
Yannis Flet-Berliac
Arthur Flajolet
Guillaume Richard
Thomas Pierrot
LM&Ro
OffRL
45
5
0
20 Jul 2023
Adaptive Ensemble Q-learning: Minimizing Estimation Bias via Error Feedback
Hang Wang
Sen Lin
Junshan Zhang
26
19
0
20 Jun 2023
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning
Yi Zhao
Rinu Boney
Alexander Ilin
Arno Solin
Joni Pajarinen
OffRL
OnRL
28
39
0
25 Oct 2022
The Pump Scheduling Problem: A Real-World Scenario for Reinforcement Learning
Henrique Donancio
L. Vercouter
H. Roclawski
AI4CE
18
1
0
20 Oct 2022
Online Weighted Q-Ensembles for Reduced Hyperparameter Tuning in Reinforcement Learning
R. G. Oliveira
W. Caarls
OffRL
31
0
0
29 Sep 2022
On the Convergence Theory of Meta Reinforcement Learning with Personalized Policies
Haozhi Wang
Qing Wang
Yunfeng Shao
Dong Li
Jianye Hao
Yinchuan Li
36
0
0
21 Sep 2022
MAN: Multi-Action Networks Learning
Keqin Wang
Alison Bartsch
A. Farimani
21
3
0
19 Sep 2022
Prediction Based Decision Making for Autonomous Highway Driving
Mustafa Yildirim
Sajjad Mozaffari
Lucy McCutcheon
M. Dianati
Alireza Tamaddoni-Nezhad
Saber Fallah
16
7
0
05 Sep 2022
Do We Need to Penalize Variance of Losses for Learning with Label Noise?
Yexiong Lin
Yu Yao
Yuxuan Du
Jun Yu
Bo Han
Biwei Huang
Tongliang Liu
NoLa
53
3
0
30 Jan 2022
Model-Value Inconsistency as a Signal for Epistemic Uncertainty
Angelos Filos
Eszter Vértes
Zita Marinho
Gregory Farquhar
Diana Borsa
A. Friesen
Feryal M. P. Behbahani
Tom Schaul
André Barreto
Simon Osindero
44
7
0
08 Dec 2021
Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance
Yanqiu Wu
Xinyue Chen
Che Wang
Yiming Zhang
Keith Ross
OffRL
17
9
0
17 Nov 2021
Balanced Q-learning: Combining the Influence of Optimistic and Pessimistic Targets
Thommen George Karimpanal
Hung Le
Majid Abdolshah
Santu Rana
Sunil R. Gupta
T. Tran
Svetha Venkatesh
17
5
0
03 Nov 2021
Automating Control of Overestimation Bias for Reinforcement Learning
Arsenii Kuznetsov
Alexander Grishin
Artem Tsypin
Arsenii Ashukha
Artur Kadurin
Dmitry Vetrov
OffRL
6
2
0
26 Oct 2021
A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets
J. E. Grigsby
Yanjun Qi
OffRL
34
5
0
10 Oct 2021
Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning
Edoardo Cetin
Oya Celiktutan
OffRL
47
17
0
07 Oct 2021
Dropout Q-Functions for Doubly Efficient Reinforcement Learning
Takuya Hiraoka
Takahisa Imagawa
Taisei Hashimoto
Takashi Onishi
Yoshimasa Tsuruoka
13
105
0
05 Oct 2021
On the Estimation Bias in Double Q-Learning
Zhizhou Ren
Guangxiang Zhu
Haotian Hu
Beining Han
Jian-Hai Chen
Chongjie Zhang
24
17
0
29 Sep 2021
Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
Yue Wu
Shuangfei Zhai
Nitish Srivastava
J. Susskind
Jian Zhang
Ruslan Salakhutdinov
Hanlin Goh
EDL
OffRL
OnRL
21
184
0
17 May 2021
Regularized Softmax Deep Multi-Agent $Q$-Learning
L. Pan
Tabish Rashid
Bei Peng
Longbo Huang
Shimon Whiteson
42
31
0
22 Mar 2021
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction
Hongyao Tang
Jianye Hao
Guangyong Chen
Pengfei Chen
Chong Chen
Yaodong Yang
Lu Zhang
Wulong Liu
Zhaopeng Meng
OffRL
35
4
0
03 Mar 2021
Softmax Deep Double Deterministic Policy Gradients
Ling Pan
Qingpeng Cai
Longbo Huang
72
86
0
19 Oct 2020
Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient
Hao Jia
Xiao Zhang
Jun Xu
Wei Zeng
Hao Jiang
Xiao Yan
Ji-Rong Wen
25
3
0
25 Jul 2020
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee
Michael Laskin
A. Srinivas
Pieter Abbeel
OffRL
25
199
0
09 Jul 2020
The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning
Lingheng Meng
R. Gorbet
Dana Kulić
OffRL
30
27
0
23 Jun 2020
Self-Imitation Learning via Generalized Lower Bound Q-learning
Yunhao Tang
SSL
33
24
0
12 Jun 2020
Gradient Monitored Reinforcement Learning
Mohammed Sharafath Abdul Hameed
Gavneet Singh Chadha
Andreas Schwung
S. Ding
33
10
0
25 May 2020
A Deep Ensemble Multi-Agent Reinforcement Learning Approach for Air Traffic Control
Supriyo Ghosh
Sean Laguna
Shiau Hong Lim
L. Wynter
Hasan A. Poonawala
35
14
0
03 Apr 2020
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning
Qingfeng Lan
Yangchen Pan
Alona Fyshe
Martha White
8
176
0
16 Feb 2020
A Survey of Deep Reinforcement Learning in Video Games
Kun Shao
Zhentao Tang
Yuanheng Zhu
Nannan Li
Dongbin Zhao
OffRL
AI4TS
43
188
0
23 Dec 2019
Multi-Path Policy Optimization
L. Pan
Qingpeng Cai
Longbo Huang
18
2
0
11 Nov 2019
In Hindsight: A Smooth Reward for Steady Exploration
H. Jomaa
Josif Grabocka
Lars Schmidt-Thieme
11
0
0
24 Jun 2019
Learning Manipulation Skills Via Hierarchical Spatial Attention
Marcus Gualtieri
Robert W. Platt
33
13
0
19 Apr 2019
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Denis Steckelmacher
Hélène Plisnier
D. Roijers
A. Nowé
OffRL
26
17
0
11 Mar 2019
Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space
Jiechao Xiong
Qing Wang
Zhuoran Yang
Peng Sun
Lei Han
Yang Zheng
Haobo Fu
Tong Zhang
Ji Liu
Han Liu
37
169
0
10 Oct 2018
Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network
Wenjia Meng
Qian Zheng
L. Yang
Pengfei Li
Gang Pan
20
21
0
14 Jun 2018
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,505
0
25 Jan 2017