Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2006.04779
Cited By
Conservative Q-Learning for Offline Reinforcement Learning
8 June 2020
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Conservative Q-Learning for Offline Reinforcement Learning"
50 / 409 papers shown
Title
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
33
1
0
16 May 2025
Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer
Minh Hoang Nguyen
Linh Le Pham Van
Thommen George Karimpanal
Sunil Gupta
Hung Le
OffRL
LRM
37
0
0
14 May 2025
Feasibility-Aware Pessimistic Estimation: Toward Long-Horizon Safety in Offline RL
Zhikun Tao
Gang Xiong
He Fang
Zhen Shen
Yunjun Han
Qing-Shan Jia
OffRL
34
0
0
13 May 2025
What Matters for Batch Online Reinforcement Learning in Robotics?
Perry Dong
Suvir Mirchandani
Dorsa Sadigh
Chelsea Finn
OffRL
31
0
0
12 May 2025
VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making
Jake Grigsby
Yuke Zhu
Michael S Ryoo
Juan Carlos Niebles
OffRL
VLM
41
0
0
06 May 2025
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu
Sili Huang
Zhengyuan Yang
Shengchao Hu
Li Shen
H. Chen
Lichao Sun
Yi-Ju Chang
Dacheng Tao
OffRL
170
0
0
03 May 2025
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yuhan Li
Eugene Han
Yifan Hu
Wenzhuo Zhou
Zhengling Qi
Yifan Cui
Ruoqing Zhu
OffRL
162
0
0
01 May 2025
Uncertainty-aware Latent Safety Filters for Avoiding Out-of-Distribution Failures
Junwon Seo
Kensuke Nakamura
Andrea V. Bajcsy
56
0
0
01 May 2025
Learning Neural Control Barrier Functions from Offline Data with Conservatism
Ihab Tabbara
Hussein Sibai
OffRL
65
0
0
01 May 2025
Fine-Tuning without Performance Degradation
Han Wang
Adam White
Martha White
OnRL
187
0
0
01 May 2025
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Wenjun Cao
52
0
0
26 Apr 2025
Offline Learning of Controllable Diverse Behaviors
Mathieu Petitbois
Rémy Portelas
Sylvain Lamprier
Ludovic Denoyer
OffRL
38
0
0
25 Apr 2025
Playing Non-Embedded Card-Based Games with Reinforcement Learning
Tianyang Wu
Lipeng Wan
Yuhang Wang
Qiang Wan
Xuguang Lan
OffRL
30
0
0
07 Apr 2025
Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
Younghwan Lee
Tung M. Luu
Donghoon Lee
Chang D. Yoo
3DV
VLM
OffRL
41
0
0
03 Apr 2025
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
Chan Kim
Seung-Woo Seo
Seong-Woo Kim
OODD
187
0
0
21 Mar 2025
Quantization-Free Autoregressive Action Transformer
Ziyad Sheebaelhamd
Michael Tschannen
Michael Muehlebach
Claire Vernade
49
0
0
18 Mar 2025
SE(3)-Equivariant Robot Learning and Control: A Tutorial Survey
Joohwan Seo
Soochul Yoo
Junwoo Chang
Hyunseok An
Hyunwoo Ryu
Soomi Lee
Arvind Kruthiventy
Jongeun Choi
R. Horowitz
71
2
0
12 Mar 2025
Mitigating Preference Hacking in Policy Optimization with Pessimism
Dhawal Gupta
Adam Fisch
Christoph Dann
Alekh Agarwal
76
0
0
10 Mar 2025
DPR: Diffusion Preference-based Reward for Offline Reinforcement Learning
Teng Pang
Bingzheng Wang
Guoqiang Wu
Yilong Yin
OffRL
73
0
0
03 Mar 2025
Uncertainty Comes for Free: Human-in-the-Loop Policies with Diffusion Models
Zhanpeng He
Yifeng Cao
M. Ciocarlie
69
0
0
26 Feb 2025
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices
Lixing Lyu
Jiashuo Jiang
Wang Chi Cheung
42
1
0
24 Feb 2025
Yes, Q-learning Helps Offline In-Context RL
Denis Tarasov
Alexander Nikulin
Ilya Zisman
Albina Klepach
Andrei Polubarov
Nikita Lyubaykin
Alexander Derevyagin
Igor Kiselev
Vladislav Kurenkov
OffRL
OnRL
199
0
0
24 Feb 2025
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hojoon Lee
Youngdo Lee
Takuma Seno
Donghu Kim
Peter Stone
Jaegul Choo
63
1
0
24 Feb 2025
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen
Jincheng Mei
Katayoon Goshvadi
Hanjun Dai
Tong Yang
Sherry Yang
Dale Schuurmans
Yuejie Chi
Bo Dai
OffRL
65
23
0
20 Feb 2025
Data Center Cooling System Optimization Using Offline Reinforcement Learning
Xianyuan Zhan
Xiangyu Zhu
Peng Cheng
Xiao Hu
Ziteng He
...
Chenhui Liu
Tianshun Hong
Yan Liang
Yunxin Liu
Feng Zhao
AI4CE
62
0
0
17 Feb 2025
Learning Strategy Representation for Imitation Learning in Multi-Agent Games
Shiqi Lei
Kanghon Lee
Linjing Li
Jinkyoo Park
OffRL
42
0
0
17 Feb 2025
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu
Lingfeng Zhao
Shivangi Agarwal
Jinghan Liu
Audrey Huang
P. Amortila
Nan Jiang
OODD
OffRL
101
0
0
11 Feb 2025
Model-Based Offline Reinforcement Learning with Reliability-Guaranteed Sequence Modeling
Shenghong He
OffRL
192
0
0
10 Feb 2025
Skill Expansion and Composition in Parameter Space
Tenglong Liu
J. Li
Yinan Zheng
Haoyi Niu
Yixing Lan
Xin Xu
Xianyuan Zhan
58
4
0
09 Feb 2025
Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning
Wesley A. Suttle
A. Suresh
Carlos Nieto-Granda
OffRL
97
0
0
06 Feb 2025
Learning from Active Human Involvement through Proxy Value Propagation
Zhenghao Peng
Wenjie Mo
Chenda Duan
Quanyi Li
Bolei Zhou
107
14
0
05 Feb 2025
Dual Alignment Maximin Optimization for Offline Model-based RL
Chi Zhou
Wang Luo
Haoran Li
Congying Han
Tiande Guo
Zicheng Zhang
OffRL
71
0
0
02 Feb 2025
B3C: A Minimalist Approach to Offline Multi-Agent Reinforcement Learning
Woojun Kim
Katia P. Sycara
OffRL
94
0
0
30 Jan 2025
Reinforcement Teaching
Alex Lewandowski
Calarina Muslimani
Dale Schuurmans
Matthew E. Taylor
Jun Luo
81
1
0
28 Jan 2025
Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning
Hao Sun
M. Schaar
94
14
0
28 Jan 2025
Coordinating Ride-Pooling with Public Transit using Reward-Guided Conservative Q-Learning: An Offline Training and Online Fine-Tuning Reinforcement Learning Framework
Yulong Hu
Tingting Dong
Sen Li
OffRL
OnRL
64
0
0
24 Jan 2025
State Combinatorial Generalization In Decision Making With Conditional Diffusion Models
Xintong Duan
Yutong He
Fahim Tajwar
Wen-Tse Chen
Ruslan Salakhutdinov
Jeff Schneider
OffRL
AI4CE
101
0
0
22 Jan 2025
An Offline Multi-Agent Reinforcement Learning Framework for Radio Resource Management
Eslam Eldeeb
Hirley Alves
OffRL
80
0
0
22 Jan 2025
Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
Abdullah Akgul
Manuel Haußmann
M. Kandemir
OffRL
76
1
0
17 Jan 2025
Methodology for Interpretable Reinforcement Learning for Optimizing Mechanical Ventilation
Joo Seung Lee
Malini Mahendra
Anil Aswani
OffRL
67
1
0
10 Jan 2025
Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba
Wall Kim
Mamba
60
0
0
10 Jan 2025
Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning
Tao Liu
Qi Xu
Wei Shi
Zhigang Hua
Shuang Yang
OffRL
43
0
0
09 Jan 2025
SR-Reward: Taking The Path More Traveled
Seyed Mahdi Basiri Azad
Zahra Padar
Gabriel Kalweit
Joschka Boedecker
OffRL
67
0
0
04 Jan 2025
MADiff: Offline Multi-agent Learning with Diffusion Models
Zhengbang Zhu
Minghuan Liu
Liyuan Mao
Bingyi Kang
Minkai Xu
Yong Yu
Stefano Ermon
Weinan Zhang
DiffM
OffRL
88
34
0
03 Jan 2025
OMG-RL:Offline Model-based Guided Reward Learning for Heparin Treatment
Yooseok Lim
Sujee Lee
OffRL
150
0
0
03 Jan 2025
UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI
Fangwei Zhong
Kui Wu
Churan Wang
Hao Chen
Hai Ci
Zhoujun Li
Yizhou Wang
VGen
40
0
0
31 Dec 2024
ACL-QL: Adaptive Conservative Level in Q-Learning for Offline Reinforcement Learning
Kun Wu
Yinuo Zhao
Zhihao Xu
Zhengping Che
Chengxiang Yin
C. Liu
Qinru Qiu
Feiferi Feng
OffRL
102
1
0
22 Dec 2024
Dense Dynamics-Aware Reward Synthesis: Integrating Prior Experience with Demonstrations
Cevahir Köprülü
Po-han Li
Tianyu Qiu
Ruihan Zhao
T. Westenbroek
David Fridovich-Keil
Sandeep P. Chinchali
Ufuk Topcu
OffRL
94
0
0
02 Dec 2024
OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control
Rohit Bokade
Xiaoning Jin
OffRL
39
0
0
10 Nov 2024
Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning
Marvin Alles
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
39
0
0
07 Nov 2024
1
2
3
4
5
6
7
8
9
Next