Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.06268
Cited By
Bi-Level Offline Policy Optimization with Limited Exploration
10 October 2023
Wenzhuo Zhou
OffRL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Bi-Level Offline Policy Optimization with Limited Exploration"
24 / 24 papers shown
Title
PR-Attack: Coordinated Prompt-RAG Attacks on Retrieval-Augmented Generation in Large Language Models via Bilevel Optimization
Yang Jiao
Xiao Wang
Kai Yang
AAML
SILM
93
1
0
10 Apr 2025
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Jinglin Chen
Nan Jiang
OffRL
98
35
0
25 Mar 2022
Offline Reinforcement Learning with Realizability and Single-policy Concentrability
Wenhao Zhan
Baihe Huang
Audrey Huang
Nan Jiang
Jason D. Lee
OffRL
394
112
0
09 Feb 2022
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
C. Shi
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
65
26
0
12 Nov 2021
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
Wenzhuo Zhou
Ruoqing Zhu
Annie Qu
60
22
0
20 Oct 2021
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
299
924
0
12 Oct 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
95
119
0
19 Aug 2021
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
Masatoshi Uehara
Wen Sun
OffRL
141
150
0
13 Jul 2021
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee
Wonseok Jeon
Byung-Jun Lee
J. Pineau
Kee-Eung Kim
OffRL
170
100
0
21 Jun 2021
Bellman-consistent Pessimism for Offline Reinforcement Learning
Tengyang Xie
Ching-An Cheng
Nan Jiang
Paul Mineiro
Alekh Agarwal
OffRL
LRM
154
278
0
13 Jun 2021
A Minimalist Approach to Offline Reinforcement Learning
Scott Fujimoto
S. Gu
OffRL
132
828
0
12 Jun 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
225
290
0
22 Mar 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
176
359
0
30 Dec 2020
Reinforcement Learning for Strategic Recommendations
Georgios Theocharous
Yash Chandak
Philip S. Thomas
F. D. Nijs
OffRL
48
11
0
15 Sep 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
143
1,831
0
08 Jun 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP
OffRL
229
1,381
0
15 Apr 2020
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
69
321
0
01 Aug 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
134
1,066
0
03 Jun 2019
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Lin F. Yang
Mengdi Wang
OffRL
GP
64
288
0
24 May 2019
Information-Theoretic Considerations in Batch Reinforcement Learning
Jinglin Chen
Nan Jiang
OOD
OffRL
161
378
0
01 May 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
236
1,624
0
07 Dec 2018
Safe Policy Improvement with Baseline Bootstrapping
Romain Laroche
P. Trichelair
Rémi Tachet des Combes
OffRL
64
201
0
19 Dec 2017
Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning
Daniel J. Luckett
Eric B. Laber
A. Kahkoska
D. Maahs
E. Mayer‐Davis
Michael R. Kosorok
67
137
0
10 Nov 2016
On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes
B. Scherrer
Boris Lesner
OffRL
84
51
0
29 Nov 2012
1