ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.00949
  4. Cited By
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

3 June 2019
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
    OffRL
    OnRL
ArXivPDFHTML

Papers citing "Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction"

50 / 269 papers shown
Title
Imagination-Limited Q-Learning for Offline Reinforcement Learning
Imagination-Limited Q-Learning for Offline Reinforcement Learning
Wenhui Liu
Zhijian Wu
Jingchao Wang
Dingjiang Huang
Shuigeng Zhou
OffRL
24
0
0
18 May 2025
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts
Jing-Cheng Pang
Kaiyuan Li
Yali Wang
Si-Hang Yang
Shengyi Jiang
Yang Yu
OffRL
LLMAG
LM&Ro
LRM
19
0
0
15 May 2025
Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL
Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL
Zhikun Tao
Gang Xiong
He Fang
Zhen Shen
Yunjun Han
Qing-Shan Jia
OffRL
36
0
0
13 May 2025
DARLR: Dual-Agent Offline Reinforcement Learning for Recommender Systems with Dynamic Reward
DARLR: Dual-Agent Offline Reinforcement Learning for Recommender Systems with Dynamic Reward
Yi Zhang
Ruihong Qiu
Xuwei Xu
Jiajun Liu
Sen Wang
OffRL
34
0
0
12 May 2025
Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach
Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach
Minting Pan
Yitao Zheng
Jiajian Li
Yunbo Wang
Xiaokang Yang
OffRL
50
0
0
10 May 2025
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making
Jake Grigsby
Yuke Zhu
Michael S Ryoo
Juan Carlos Niebles
OffRL
VLM
46
0
0
06 May 2025
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu
Sili Huang
Zheng Yang
Shengchao Hu
Li Shen
Hechang Chen
Lichao Sun
Yi-Ju Chang
Dacheng Tao
OffRL
233
0
0
03 May 2025
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
37
0
0
02 May 2025
Learning Neural Control Barrier Functions from Offline Data with Conservatism
Learning Neural Control Barrier Functions from Offline Data with Conservatism
Ihab Tabbara
Hussein Sibai
OffRL
65
0
0
01 May 2025
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
Chan Kim
Seung-Woo Seo
Seong-Woo Kim
OODD
250
0
0
21 Mar 2025
Mitigating Preference Hacking in Policy Optimization with Pessimism
Dhawal Gupta
Adam Fisch
Christoph Dann
Alekh Agarwal
76
0
0
10 Mar 2025
M3HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality
Ziyan Wang
Zhicheng Zhang
Fei Fang
Yali Du
49
1
0
03 Mar 2025
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices
Lixing Lyu
Jiashuo Jiang
Wang Chi Cheung
42
1
0
24 Feb 2025
Data Center Cooling System Optimization Using Offline Reinforcement Learning
Data Center Cooling System Optimization Using Offline Reinforcement Learning
Xianyuan Zhan
Xiangyu Zhu
Peng Cheng
Xiao Hu
Ziteng He
...
Chenhui Liu
Tianshun Hong
Huiwen Zheng
Yunxin Liu
Feng Zhao
AI4CE
64
0
0
17 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
120
4
0
06 Feb 2025
FuzzyLight: A Robust Two-Stage Fuzzy Approach for Traffic Signal Control Works in Real Cities
Mingyuan Li
Jiahao Wang
Bo Du
Jun Shen
Qiang Wu
59
1
0
28 Jan 2025
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
Zijian Guo
Weichao Zhou
Wenchao Li
OffRL
105
2
0
28 Jan 2025
Evolution and The Knightian Blindspot of Machine Learning
Evolution and The Knightian Blindspot of Machine Learning
Joel Lehman
Elliot Meyerson
Tarek El-Gaaly
Kenneth O. Stanley
Tarin Ziyaee
99
2
0
22 Jan 2025
Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
Abdullah Akgul
Manuel Haußmann
M. Kandemir
OffRL
80
1
0
17 Jan 2025
Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
Keru Chen
Honghao Wei
Zhigang Deng
Sen Lin
OffRL
OnRL
96
0
0
31 Dec 2024
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning
Kun Wu
Yinuo Zhao
Zhihao Xu
Zhengping Che
Chengxiang Yin
C. Liu
Qinru Qiu
Feiferi Feng
OffRL
109
1
0
22 Dec 2024
OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control
OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control
Rohit Bokade
Xiaoning Jin
OffRL
39
0
0
10 Nov 2024
Acceleration for Deep Reinforcement Learning using Parallel and
  Distributed Computing: A Survey
Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey
Zhihong Liu
Xin Xu
Peng Qiao
Dongsheng Li
OffRL
34
2
0
08 Nov 2024
Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning
Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning
Marvin Alles
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
41
1
0
07 Nov 2024
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
Jing Zhang
Linjiajie Fang
Kexin Shi
Wenjia Wang
Bing-Yi Jing
OffRL
44
0
0
27 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
79
14
0
17 Oct 2024
DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive
  Revaluation
DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation
Jaehyun Park
Yunho Kim
Sejin Kim
Byung-Jun Lee
Sundong Kim
OffRL
39
1
0
15 Oct 2024
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning
Jiayu Chen
Wentse Chen
Jeff Schneider
OffRL
40
2
0
15 Oct 2024
SAPIENT: Mastering Multi-turn Conversational Recommendation with Strategic Planning and Monte Carlo Tree Search
SAPIENT: Mastering Multi-turn Conversational Recommendation with Strategic Planning and Monte Carlo Tree Search
Hanwen Du
B. Peng
Xia Ning
38
0
0
12 Oct 2024
Predictive Coding for Decision Transformer
Predictive Coding for Decision Transformer
Tung M. Luu
Donghoon Lee
Chang D. Yoo
OffRL
66
2
0
04 Oct 2024
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng
Ruixi Qiao
Gang Xiong
Binhua Li
Yingwei Ma
Binhua Li
Yongbin Li
Yisheng Lv
OffRL
OnRL
LM&Ro
50
3
0
01 Oct 2024
An Enhanced-State Reinforcement Learning Algorithm for Multi-Task Fusion in Large-Scale Recommender Systems
An Enhanced-State Reinforcement Learning Algorithm for Multi-Task Fusion in Large-Scale Recommender Systems
Peng Liu
Jiawei Zhu
Cong Xu
Ming Zhao
Bin Wang
31
1
0
18 Sep 2024
SAMBO-RL: Shifts-aware Model-based Offline Reinforcement Learning
SAMBO-RL: Shifts-aware Model-based Offline Reinforcement Learning
Wang Luo
Haoran Li
Zicheng Zhang
Congying Han
Jiayu Lv
Tiande Guo
OffRL
50
1
0
23 Aug 2024
Domain Adaptation for Offline Reinforcement Learning with Limited Samples
Domain Adaptation for Offline Reinforcement Learning with Limited Samples
Weiqin Chen
Sandipan Mishra
Santiago Paternain
OffRL
51
2
0
22 Aug 2024
Hokoff: Real Game Dataset from Honor of Kings and its Offline
  Reinforcement Learning Benchmarks
Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks
Yun Qu
Boyuan Wang
Jianzhun Shao
Yuhang Jiang
Chen Chen
...
Qiang Fu
Wei Yang
Guang Yang
Lanxiao Huang
Xiangyang Ji
OffRL
54
9
0
20 Aug 2024
ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems
ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems
Yi Zhang
Ruihong Qiu
Jiajun Liu
Sen Wang
OffRL
21
0
0
18 Jul 2024
To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning
To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning
Tao Ma
Xuzhi Yang
Zoltan Szabo
OffRL
73
0
0
01 Jul 2024
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
Vivek Myers
Chongyi Zheng
Anca Dragan
Sergey Levine
Benjamin Eysenbach
OffRL
50
8
0
24 Jun 2024
Residual Learning and Context Encoding for Adaptive Offline-to-Online
  Reinforcement Learning
Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning
Mohammadreza Nakhaei
Aidan Scannell
Joni Pajarinen
OffRL
55
1
0
12 Jun 2024
CDSA: Conservative Denoising Score-based Algorithm for Offline
  Reinforcement Learning
CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning
Zeyuan Liu
Kai Yang
Xiu Li
OffRL
46
0
0
11 Jun 2024
Augmenting Offline RL with Unlabeled Data
Augmenting Offline RL with Unlabeled Data
Zhao Wang
Briti Gangopadhyay
Jia-Fong Yeh
Shingo Takamatsu
OffRL
33
0
0
11 Jun 2024
Integrating Domain Knowledge for handling Limited Data in Offline RL
Integrating Domain Knowledge for handling Limited Data in Offline RL
Briti Gangopadhyay
Zhao Wang
Jia-Fong Yeh
Shingo Takamatsu
OffRL
32
0
0
11 Jun 2024
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL
Qi Lv
Xiang Deng
Gongwei Chen
Michael Yu Wang
Liqiang Nie
78
7
0
08 Jun 2024
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
Subhojyoti Mukherjee
Josiah P. Hanna
Qiaomin Xie
Robert Nowak
89
2
0
07 Jun 2024
UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function
  in Offline Reinforcement Learning
UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning
Yu Zhang
Rui Yu
Zhipeng Yao
Wenyuan Zhang
Jun Wang
Liming Zhang
OffRL
58
0
0
05 Jun 2024
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Linjiajie Fang
Ruoxue Liu
Jing Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
59
3
0
31 May 2024
GTA: Generative Trajectory Augmentation with Guidance for Offline
  Reinforcement Learning
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
Jaewoo Lee
Sujin Yun
Taeyoung Yun
Jinkyoo Park
52
7
0
27 May 2024
State-Constrained Offline Reinforcement Learning
State-Constrained Offline Reinforcement Learning
Charles A. Hepburn
Yue Jin
Giovanni Montana
OffRL
49
0
0
23 May 2024
Exclusively Penalized Q-learning for Offline Reinforcement Learning
Exclusively Penalized Q-learning for Offline Reinforcement Learning
Junghyuk Yeom
Yonghyeon Jo
Jungmo Kim
Sanghyeon Lee
Seungyul Han
OffRL
46
2
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
45
0
23 May 2024
123456
Next