arXiv: 2111.05440
Dealing with the Unknown: Pessimistic Offline Reinforcement Learning

9 November 2021
Jinning Li
Chen Tang
Masayoshi Tomizuka
Wei Zhan
    OffRL

Papers citing "Dealing with the Unknown: Pessimistic Offline Reinforcement Learning"

19 / 19 papers shown
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
Haoran Xu
Shuozhe Li
Harshit S. Sikchi
S. Niekum
Amy Zhang
OffRL
27
0
0
17 Apr 2025
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
Chan Kim
Seung-Woo Seo
Seong-Woo Kim
OODD
187
0
0
21 Mar 2025
Uncertainty-Penalized Direct Preference Optimization
Sam Houliston
Alizée Pace
Alexander Immer
Gunnar Rätsch
34
0
0
26 Oct 2024
ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization
The Viet Bui
Thanh Hong Nguyen
Tien Mai
OffRL
33
0
0
02 Oct 2024
Adaptive Prediction Ensemble: Improving Out-of-Distribution Generalization of Motion Forecasting
Jinning Li
Jiachen Li
Sangjae Bae
David Isele
39
4
0
12 Jul 2024
Residual-MPPI: Online Policy Customization for Continuous Control
Pengcheng Wang
Chenran Li
Catherine Weaver
Kenta Kawamoto
Masayoshi Tomizuka
Chen Tang
Wei Zhan
OffRL
37
3
0
01 Jul 2024
ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update
Liyuan Mao
Haoran Xu
Weinan Zhang
Xianyuan Zhan
34
10
0
01 Feb 2024
SeRO: Self-Supervised Reinforcement Learning for Recovery from Out-of-Distribution Situations
Chan Kim
JaeKyung Cho
C. Bobda
Seung-Woo Seo
Seong-Woo Kim
22
3
0
07 Nov 2023
Efficient Sim-to-real Transfer of Contact-Rich Manipulation Skills with Online Admittance Residual Learning
Xiang Zhang
Changhao Wang
Lingfeng Sun
Zheng Wu
Xinghao Zhu
Masayoshi Tomizuka
OffRL
35
21
0
16 Oct 2023
Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration
Jinning Li
Xinyi Liu
Banghua Zhu
Jiantao Jiao
Masayoshi Tomizuka
Chen Tang
Wei Zhan
OffRL
OnRL
69
9
0
18 Sep 2023
DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning
Xiao-Yin Liu
Xiao-Hu Zhou
Xiaoliang Xie
Shiqi Liu
Zhen-Qiu Feng
Hao Li
Mei-Jiang Gui
Tian-Yu Xiang
De-Xing Huang
Zeng-Guang Hou
OffRL
OOD
21
5
0
16 Sep 2023
An Offline Learning Approach to Propagator Models
Eyal Neuman
Wolfgang Stockinger
Yufei Zhang
OffRL
25
6
0
06 Sep 2023
Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning
S. E. Ada
Erhan Öztop
Emre Ugur
OffRL
44
15
0
10 Jul 2023
Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning
Jinyi Liu
Y. Ma
Jianye Hao
Yujing Hu
Yan Zheng
Tangjie Lv
Changjie Fan
OffRL
44
2
0
27 Jun 2023
Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization
Jinxin Liu
Hongyin Zhang
Zifeng Zhuang
Yachen Kang
Donglin Wang
Bin Wang
OffRL
44
8
0
26 Jun 2023
Residual Q-Learning: Offline and Online Policy Customization without Value
Chenran Li
Chen Tang
Haruki Nishimura
Jean-Pierre Mercat
Masayoshi Tomizuka
Wei Zhan
OffRL
36
6
0
15 Jun 2023
When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning
Haoyi Niu
Shubham Sharma
Yiwen Qiu
Ming Li
Guyue Zhou
Jianming Hu
Xianyuan Zhan
OffRL
OnRL
27
46
0
27 Jun 2022
Hierarchical Planning Through Goal-Conditioned Offline Reinforcement Learning
Jinning Li
Chen Tang
Masayoshi Tomizuka
Wei Zhan
OffRL
53
57
0
24 May 2022
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
340
1,960
0
04 May 2020