ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.00949
  4. Cited By
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

3 June 2019
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
    OffRL
    OnRL
ArXivPDFHTML

Papers citing "Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction"

50 / 271 papers shown
Title
Importance Weighted Actor-Critic for Optimal Conservative Offline
  Reinforcement Learning
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
45
15
0
30 Jan 2023
Constrained Policy Optimization with Explicit Behavior Density for
  Offline Reinforcement Learning
Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning
Jing Zhang
Chi Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
35
7
0
28 Jan 2023
Benchmarks and Algorithms for Offline Preference-Based Reward Learning
Benchmarks and Algorithms for Offline Preference-Based Reward Learning
Daniel Shin
Anca Dragan
Daniel S. Brown
OffRL
19
53
0
03 Jan 2023
Offline Policy Optimization in RL with Variance Regularizaton
Offline Policy Optimization in RL with Variance Regularizaton
Riashat Islam
Samarth Sinha
Homanga Bharadhwaj
Samin Yeasar Arnob
Zhuoran Yang
Animesh Garg
Zhaoran Wang
Lihong Li
Doina Precup
OffRL
28
0
0
29 Dec 2022
On Pathologies in KL-Regularized Reinforcement Learning from Expert
  Demonstrations
On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations
Tim G. J. Rudner
Cong Lu
Michael A. Osborne
Yarin Gal
Yee Whye Teh
OffRL
38
27
0
28 Dec 2022
Multi-embodiment Legged Robot Control as a Sequence Modeling Problem
Multi-embodiment Legged Robot Control as a Sequence Modeling Problem
Chenyi Yu
Weinan Zhang
H. Lai
Zheng Tian
L. Kneip
Jun Wang
35
15
0
18 Dec 2022
Confidence-Conditioned Value Functions for Offline Reinforcement
  Learning
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
Joey Hong
Aviral Kumar
Sergey Levine
OffRL
39
20
0
08 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy
  Constraints and Q-Ensemble
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
32
0
0
07 Dec 2022
Launchpad: Learning to Schedule Using Offline and Online RL Methods
Launchpad: Learning to Schedule Using Offline and Online RL Methods
V. Venkataswamy
J. E. Grigsby
A. Grimshaw
Yanjun Qi
OffRL
OnRL
24
1
0
01 Dec 2022
Behavior Estimation from Multi-Source Data for Offline Reinforcement
  Learning
Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning
Guoxi Zhang
H. Kashima
OffRL
29
2
0
29 Nov 2022
Causal Deep Reinforcement Learning Using Observational Data
Causal Deep Reinforcement Learning Using Observational Data
Wenxuan Zhu
Chao Yu
Qiaosheng Zhang
CML
OffRL
26
5
0
28 Nov 2022
Offline Q-Learning on Diverse Multi-Task Data Both Scales And
  Generalizes
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
Aviral Kumar
Rishabh Agarwal
Xinyang Geng
George Tucker
Sergey Levine
OffRL
44
48
0
28 Nov 2022
Multi-Environment Pretraining Enables Transfer to Action Limited
  Datasets
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
David Venuto
Sherry Yang
Pieter Abbeel
Doina Precup
Igor Mordatch
Ofir Nachum
OffRL
25
5
0
23 Nov 2022
Improving TD3-BC: Relaxed Policy Constraint for Offline Learning and
  Stable Online Fine-Tuning
Improving TD3-BC: Relaxed Policy Constraint for Offline Learning and Stable Online Fine-Tuning
Alex Beeson
Giovanni Montana
OffRL
OnRL
26
23
0
21 Nov 2022
Model-based Trajectory Stitching for Improved Offline Reinforcement
  Learning
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning
Charles A. Hepburn
Giovanni Montana
OffRL
37
13
0
21 Nov 2022
Let Offline RL Flow: Training Conservative Agents in the Latent Space of
  Normalizing Flows
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows
D. Akimov
Vladislav Kurenkov
Alexander Nikulin
Denis Tarasov
Sergey Kolesnikov
OffRL
24
9
0
20 Nov 2022
Learning Reward Functions for Robotic Manipulation by Observing Humans
Learning Reward Functions for Robotic Manipulation by Observing Humans
Minttu Alakuijala
Gabriel Dulac-Arnold
Julien Mairal
Jean Ponce
Cordelia Schmid
OffRL
41
27
0
16 Nov 2022
Offline Reinforcement Learning with Adaptive Behavior Regularization
Offline Reinforcement Learning with Adaptive Behavior Regularization
Yunfan Zhou
Xijun Li
Qingyu Qu
OffRL
27
1
0
15 Nov 2022
Spatio-temporal Incentives Optimization for Ride-hailing Services with
  Offline Deep Reinforcement Learning
Spatio-temporal Incentives Optimization for Ride-hailing Services with Offline Deep Reinforcement Learning
Yanqiu Wu
Qingyang Li
Zhiwei Qin
OffRL
20
3
0
06 Nov 2022
Wall Street Tree Search: Risk-Aware Planning for Offline Reinforcement
  Learning
Wall Street Tree Search: Risk-Aware Planning for Offline Reinforcement Learning
D. Elbaz
Gal Novik
Oren Salzman
OffRL
33
0
0
06 Nov 2022
Dual Generator Offline Reinforcement Learning
Dual Generator Offline Reinforcement Learning
Q. Vuong
Aviral Kumar
Sergey Levine
Yevgen Chebotar
OffRL
34
1
0
02 Nov 2022
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online
  Reinforcement Learning
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning
Yi Zhao
Rinu Boney
Alexander Ilin
Arno Solin
Joni Pajarinen
OffRL
OnRL
28
39
0
25 Oct 2022
Robust Offline Reinforcement Learning with Gradient Penalty and
  Constraint Relaxation
Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation
Chengqian Gao
Kelvin Xu
Liu Liu
Deheng Ye
P. Zhao
Zhiqiang Xu
OffRL
45
2
0
19 Oct 2022
Boosting Offline Reinforcement Learning via Data Rebalancing
Boosting Offline Reinforcement Learning via Data Rebalancing
Yang Yue
Bingyi Kang
Xiao Ma
Zhongwen Xu
Gao Huang
Shuicheng Yan
OffRL
26
22
0
17 Oct 2022
A Policy-Guided Imitation Approach for Offline Reinforcement Learning
A Policy-Guided Imitation Approach for Offline Reinforcement Learning
Haoran Xu
Li Jiang
Jianxiong Li
Xianyuan Zhan
OffRL
31
62
0
15 Oct 2022
Sustainable Online Reinforcement Learning for Auto-bidding
Sustainable Online Reinforcement Learning for Auto-bidding
Zhiyu Mou
Yusen Huo
Rongquan Bai
Mingzhou Xie
Chuan Yu
Jian Xu
Bo Zheng
OffRL
OnRL
34
15
0
13 Oct 2022
Semi-Supervised Offline Reinforcement Learning with Action-Free
  Trajectories
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
Qinqing Zheng
Mikael Henaff
Brandon Amos
Aditya Grover
OffRL
26
20
0
12 Oct 2022
Reliable Conditioning of Behavioral Cloning for Offline Reinforcement
  Learning
Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning
Tung Nguyen
Qinqing Zheng
Aditya Grover
OffRL
39
6
0
11 Oct 2022
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets
Chen Gong
Zhou Yang
Yunru Bai
Junda He
Jieke Shi
...
Arunesh Sinha
Bowen Xu
Xinwen Hou
David Lo
Guoliang Fan
AAML
OffRL
29
7
0
07 Oct 2022
B2RL: An open-source Dataset for Building Batch Reinforcement Learning
B2RL: An open-source Dataset for Building Batch Reinforcement Learning
Hsin-Yu Liu
Xiaohan Fu
Bharathan Balaji
Rajesh E. Gupta
Dezhi Hong
OffRL
27
4
0
30 Sep 2022
S2P: State-conditioned Image Synthesis for Data Augmentation in Offline
  Reinforcement Learning
S2P: State-conditioned Image Synthesis for Data Augmentation in Offline Reinforcement Learning
Daesol Cho
D. Shim
H. J. Kim
OffRL
45
11
0
30 Sep 2022
Offline Reinforcement Learning via High-Fidelity Generative Behavior
  Modeling
Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling
Huayu Chen
Cheng Lu
Chengyang Ying
Hang Su
Jun Zhu
DiffM
OffRL
108
106
0
29 Sep 2022
Age of Semantics in Cooperative Communications: To Expedite Simulation
  Towards Real via Offline Reinforcement Learning
Age of Semantics in Cooperative Communications: To Expedite Simulation Towards Real via Offline Reinforcement Learning
Xianfu Chen
Zhifeng Zhao
S. Mao
Celimuge Wu
Honggang Zhang
M. Bennis
OffRL
31
3
0
19 Sep 2022
Reducing Variance in Temporal-Difference Value Estimation via Ensemble
  of Deep Networks
Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks
Litian Liang
Yaosheng Xu
Stephen Marcus McAleer
Dailin Hu
Alexander Ihler
Pieter Abbeel
Roy Fox
OOD
29
16
0
16 Sep 2022
Distributionally Robust Offline Reinforcement Learning with Linear
  Function Approximation
Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation
Xiaoteng Ma
Zhipeng Liang
Jose H. Blanchet
MingWen Liu
Li Xia
Jiheng Zhang
Qianchuan Zhao
Zhengyuan Zhou
OOD
OffRL
41
22
0
14 Sep 2022
Q-learning Decision Transformer: Leveraging Dynamic Programming for
  Conditional Sequence Modelling in Offline RL
Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
Taku Yamagata
Ahmed Khalil
Raúl Santos-Rodríguez
OffRL
160
72
0
08 Sep 2022
Dialogue Evaluation with Offline Reinforcement Learning
Dialogue Evaluation with Offline Reinforcement Learning
Nurul Lubis
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Michael Heck
Shutong Feng
Milica Gavsić
OffRL
27
4
0
02 Sep 2022
Efficient Planning in a Compact Latent Action Space
Efficient Planning in a Compact Latent Action Space
Zhengyao Jiang
Tianjun Zhang
Michael Janner
Yueying Li
Tim Rocktaschel
Edward Grefenstette
Yuandong Tian
OffRL
24
37
0
22 Aug 2022
Impact Makes a Sound and Sound Makes an Impact: Sound Guides
  Representations and Explorations
Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations
Xufeng Zhao
C. Weber
Muhammad Burhan Hafez
S. Wermter
28
8
0
04 Aug 2022
Robot Policy Learning from Demonstration Using Advantage Weighting and
  Early Termination
Robot Policy Learning from Demonstration Using Advantage Weighting and Early Termination
A. Mohtasib
Gerhard Neumann
Heriberto Cuayáhuitl
OffRL
44
2
0
31 Jul 2022
Offline Reinforcement Learning at Multiple Frequencies
Offline Reinforcement Learning at Multiple Frequencies
Kaylee Burns
Tianhe Yu
Chelsea Finn
Karol Hausman
OffRL
33
6
0
26 Jul 2022
Discriminator-Weighted Offline Imitation Learning from Suboptimal
  Demonstrations
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
Haoran Xu
Xianyuan Zhan
Honglei Yin
Huiling Qin
OffRL
33
66
0
20 Jul 2022
Making Linear MDPs Practical via Contrastive Representation Learning
Making Linear MDPs Practical via Contrastive Representation Learning
Tianjun Zhang
Tongzheng Ren
Mengjiao Yang
Joseph E. Gonzalez
Dale Schuurmans
Bo Dai
25
44
0
14 Jul 2022
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic
  Reinforcement Learning
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning
Homer Walke
Jonathan Yang
Albert Yu
Aviral Kumar
Jedrzej Orbik
Avi Singh
Sergey Levine
OffRL
OnRL
29
32
0
11 Jul 2022
Offline RL Policies Should be Trained to be Adaptive
Offline RL Policies Should be Trained to be Adaptive
Dibya Ghosh
Anurag Ajay
Pulkit Agrawal
Sergey Levine
OffRL
35
45
0
05 Jul 2022
Offline Policy Optimization with Eligible Actions
Offline Policy Optimization with Eligible Actions
Yao Liu
Yannis Flet-Berliac
Emma Brunskill
OffRL
25
5
0
01 Jul 2022
Watch and Match: Supercharging Imitation with Regularized Optimal
  Transport
Watch and Match: Supercharging Imitation with Regularized Optimal Transport
Siddhant Haldar
Vaibhav Mathur
Denis Yarats
Lerrel Pinto
61
62
0
30 Jun 2022
SMPL: Simulated Industrial Manufacturing and Process Control Learning
  Environments
SMPL: Simulated Industrial Manufacturing and Process Control Learning Environments
Mohan Zhang
Xiaozhou Wang
Benjamin Decardi-Nelson
Bo Song
A. Zhang
...
Jiayi Cheng
Xiaohong Liu
DengDeng Yu
Matthew Poon
Animesh Garg
26
4
0
17 Jun 2022
Relative Policy-Transition Optimization for Fast Policy Transfer
Relative Policy-Transition Optimization for Fast Policy Transfer
Jiawei Xu
Cheng Zhou
Yizheng Zhang
Zhengyou Zhang
Lei Han
21
0
0
13 Jun 2022
Challenges and Opportunities in Offline Reinforcement Learning from
  Visual Observations
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
Cong Lu
Philip J. Ball
Tim G. J. Rudner
Jack Parker-Holder
Michael A. Osborne
Yee Whye Teh
OffRL
32
52
0
09 Jun 2022
Previous
123456
Next