ResearchTrend.AI
Offline Reinforcement Learning with Implicit Q-Learning

12 October 2021
Ilya Kostrikov
Ashvin Nair
Sergey Levine
    OffRL
arXiv:2110.06169

Papers citing "Offline Reinforcement Learning with Implicit Q-Learning"

50 / 73 papers shown
PyTupli: A Scalable Infrastructure for Collaborative Offline Reinforcement Learning Projects
Hannah Markgraf
Michael Eichelbeck
Daria Cappey
Selin Demirtürk
Yara Schattschneider
Matthias Althoff
OffRL
51
0
0
22 May 2025
Of Mice and Machines: A Comparison of Learning Between Real World Mice and RL Agents
Shuo Han
German Espinosa
Junda Huang
D. Dombeck
Malcolm A. MacIver
Bradly C. Stadie
88
0
0
18 May 2025
Retrospex: Language Agent Meets Offline Reinforcement Learning Critic
Yufei Xiang
Yiqun Shen
Yeqin Zhang
Cam-Tu Nguyen
OffRL
LLMAG
KELM
LRM
176
3
0
17 May 2025
Generative Auto-Bidding with Value-Guided Explorations
Jingtong Gao
Yewen Li
Shuai Mao
Peng Jiang
Nan Jiang
...
Fei Pan
Peng Jiang
Kun Gai
Bo An
Xiangyu Zhao
OffRL
124
0
0
20 Apr 2025
SE(3)-Equivariant Robot Learning and Control: A Tutorial Survey
Joohwan Seo
Soochul Yoo
Junwoo Chang
Hyunseok An
Hyunwoo Ryu
Soomi Lee
Arvind Kruthiventy
Jongeun Choi
R. Horowitz
102
2
0
12 Mar 2025
Generative Trajectory Stitching through Diffusion Composition
Yunhao Luo
Utkarsh Aashu Mishra
Yilun Du
Danfei Xu
386
5
0
07 Mar 2025
Multi-agent Auto-Bidding with Latent Graph Diffusion Models
Dom Huh
P. Mohapatra
DiffM
AI4CE
73
0
0
04 Mar 2025
Yes, Q-learning Helps Offline In-Context RL
Denis Tarasov
Alexander Nikulin
Ilya Zisman
Albina Klepach
Andrei Polubarov
Nikita Lyubaykin
Alexander Derevyagin
Igor Kiselev
Vladislav Kurenkov
OffRL
OnRL
395
1
0
24 Feb 2025
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hojoon Lee
Youngdo Lee
Takuma Seno
Donghu Kim
Peter Stone
Jaegul Choo
137
3
0
21 Feb 2025
Data Center Cooling System Optimization Using Offline Reinforcement Learning
Xianyuan Zhan
Xiangyu Zhu
Peng Cheng
Xiao Hu
Ziteng He
...
Chenhui Liu
Tianshun Hong
Huiwen Zheng
Yunxin Liu
Feng Zhao
AI4CE
126
0
0
17 Feb 2025
Model-Based Offline Reinforcement Learning with Reliability-Guaranteed Sequence Modeling
Shenghong He
OffRL
401
0
0
10 Feb 2025
Skill Expansion and Composition in Parameter Space
Tenglong Liu
Junjie Li
Yinan Zheng
Haoyi Niu
Yixing Lan
Xin Xu
Xianyuan Zhan
103
4
0
09 Feb 2025
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
Vivek Myers
Bill Chunyuan Zheng
Anca Dragan
Kuan Fang
Sergey Levine
140
0
0
08 Feb 2025
Rapidly Adapting Policies to the Real World via Simulation-Guided Fine-Tuning
Patrick Yin
Tyler Westenbroek
Simran Bagaria
Kevin Huang
Ching-an Cheng
Andrey Kolobov
Abhishek Gupta
124
3
0
04 Feb 2025
Strengthening Generative Robot Policies through Predictive World Modeling
Han Qi
Haocheng Yin
Aris Zhu
Yilun Du
Heng Yang
121
3
0
02 Feb 2025
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network
Jijia Liu
Feng Gao
Q. Liao
Chao Yu
Yu Wang
OffRL
113
0
0
01 Feb 2025
Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba
Wall Kim
Mamba
94
0
0
10 Jan 2025
SR-Reward: Taking The Path More Traveled
Seyed Mahdi Basiri Azad
Zahra Padar
Gabriel Kalweit
Joschka Boedecker
OffRL
127
0
0
04 Jan 2025
OMG-RL:Offline Model-based Guided Reward Learning for Heparin Treatment
Yooseok Lim
Sujee Lee
OffRL
197
0
0
03 Jan 2025
AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games
Kefan Su
Yusen Huo
Zhilin Zhang
Shuai Dou
Chuan Yu
Jian Xu
Zongqing Lu
Bo Zheng
132
7
0
31 Dec 2024
Hierarchical Subspaces of Policies for Continual Offline Reinforcement Learning
Anthony Kobanda
Rémy Portelas
Odalric-Ambrym Maillard
Ludovic Denoyer
OffRL
CLL
124
1
0
19 Dec 2024
Auto-bidding in real-time auctions via Oracle Imitation Learning (OIL)
Alberto Silvio Chiappa
Briti Gangopadhyay
Zhao Wang
Shingo Takamatsu
125
1
0
16 Dec 2024
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
A. Jain
Harley Wiltzer
Jesse Farebrother
Irina Rish
Glen Berseth
Sanjiban Choudhury
98
2
0
11 Nov 2024
Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning
Marvin Alles
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
76
1
0
07 Nov 2024
Out-of-Distribution Recovery with Object-Centric Keypoint Inverse Policy for Visuomotor Imitation Learning
George Jiayuan Gao
Tianyu Li
Nadia Figueroa
103
0
0
05 Nov 2024
Local Policies Enable Zero-shot Long-horizon Manipulation
Murtaza Dalal
Min Liu
Walter Talbott
Chen Chen
Deepak Pathak
Jian Zhang
Ruslan Salakhutdinov
109
3
0
29 Oct 2024
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
Jing Zhang
Linjiajie Fang
Kexin Shi
Wenjia Wang
Bing-Yi Jing
OffRL
111
0
0
27 Oct 2024
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park
Kevin Frans
Benjamin Eysenbach
Sergey Levine
OffRL
114
25
0
26 Oct 2024
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson
Qiyang Li
Kevin Frans
Sergey Levine
SSL
OffRL
OnRL
135
0
0
23 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
110
15
0
17 Oct 2024
Diffusion Model Predictive Control
Guangyao Zhou
Sivaramakrishnan Swaminathan
Rajkumar Vasudeva Raju
J. S. Guntupalli
Wolfgang Lehrach
Joseph Ortiz
Antoine Dedieu
Miguel Lázaro-Gredilla
Kevin P. Murphy
64
10
0
07 Oct 2024
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Zhaolin Gao
Wenhao Zhan
Jonathan D. Chang
Gokul Swamy
Kianté Brantley
Jason D. Lee
Wen Sun
OffRL
119
7
0
06 Oct 2024
Predictive Coding for Decision Transformer
Tung M. Luu
Donghoon Lee
Chang D. Yoo
OffRL
85
2
0
04 Oct 2024
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng
Ruixi Qiao
Gang Xiong
Binhua Li
Yingwei Ma
Yongbin Li
Yisheng Lv
OffRL
OnRL
LM&Ro
85
4
0
01 Oct 2024
An Enhanced-State Reinforcement Learning Algorithm for Multi-Task Fusion in Large-Scale Recommender Systems
Peng Liu
Jiawei Zhu
Cong Xu
Ming Zhao
Bin Wang
49
1
0
18 Sep 2024
Offline Reinforcement Learning for Learning to Dispatch for Job Shop Scheduling
Jesse van Remmerden
Zaharah Bukhsh
Yingqian Zhang
OffRL
OnRL
92
1
0
16 Sep 2024
Domain Adaptation for Offline Reinforcement Learning with Limited Samples
Weiqin Chen
Sandipan Mishra
Santiago Paternain
OffRL
82
2
0
22 Aug 2024
q-exponential family for policy optimization
Lingwei Zhu
Haseeb Shah
Han Wang
Yukie Nagai
Martha White
OffRL
96
0
0
14 Aug 2024
How to Solve Contextual Goal-Oriented Problems with Offline Datasets?
Ying Fan
Jingling Li
Adith Swaminathan
Aditya Modi
Ching-An Cheng
OffRL
94
0
0
14 Aug 2024
To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning
Tao Ma
Xuzhi Yang
Zoltan Szabo
OffRL
105
0
0
01 Jul 2024
Residual-MPPI: Online Policy Customization for Continuous Control
Pengcheng Wang
Chenran Li
Catherine Weaver
Kenta Kawamoto
Masayoshi Tomizuka
Chen Tang
Wei Zhan
OffRL
82
3
0
01 Jul 2024
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
Vivek Myers
Chongyi Zheng
Anca Dragan
Sergey Levine
Benjamin Eysenbach
OffRL
88
12
0
24 Jun 2024
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL
Qi Lv
Xiang Deng
Gongwei Chen
Michael Yu Wang
Liqiang Nie
112
7
0
08 Jun 2024
Value Improved Actor Critic Algorithms
Yaniv Oren
Moritz A. Zanger
Pascal R. van der Vaart
M. Spaan
Wendelin Bohmer
OffRL
63
0
0
03 Jun 2024
Amortizing intractable inference in diffusion models for vision, language, and control
S. Venkatraman
Moksh Jain
Luca Scimeca
Minsu Kim
Marcin Sendera
...
Alexandre Adam
Jarrid Rector-Brooks
Yoshua Bengio
Glen Berseth
Nikolay Malkin
116
30
0
31 May 2024
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Linjiajie Fang
Ruoxue Liu
Jing Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
94
7
0
31 May 2024
Return-Aligned Decision Transformer
Tsunehiko Tanaka
Kenshi Abe
Kaito Ariu
Tetsuro Morimura
Edgar Simo-Serra
OffRL
101
1
0
06 Feb 2024
A Tractable Inference Perspective of Offline RL
Xuejie Liu
Hoang Trung-Dung
Guy Van den Broeck
Yitao Liang
OffRL
84
1
0
31 Oct 2023
H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps
Haoyi Niu
Tianying Ji
Bingqi Liu
Haocheng Zhao
Xiangyu Zhu
Jianying Zheng
Pengfei Huang
Guyue Zhou
Jianming Hu
Xianyuan Zhan
OffRL
OnRL
AI4CE
76
8
0
22 Sep 2023
Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning
Jinyi Liu
Yi Ma
Jianye Hao
Yujing Hu
Yan Zheng
Tangjie Lv
Changjie Fan
OffRL
101
2
0
27 Jun 2023