1605.08062
A PAC RL Algorithm for Episodic POMDPs
25 May 2016
Z. Guo
Shayan Doroudi
Emma Brunskill
Papers citing "A PAC RL Algorithm for Episodic POMDPs"
15 / 15 papers shown
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
27
4
0
22 Feb 2024
Provable Representation with Efficient Planning for Partial Observable Reinforcement Learning
Hongming Zhang
Tongzheng Ren
Chenjun Xiao
Dale Schuurmans
Bo Dai
45
3
0
20 Nov 2023
Posterior Sampling-based Online Learning for Episodic POMDPs
Dengwang Tang
Dongze Ye
Rahul Jain
A. Nayyar
Pierluigi Nuzzo
OffRL
51
0
0
16 Oct 2023
Learning Optimal Admission Control in Partially Observable Queueing Networks
Jonatha Anselmi
B. Gaujal
Louis-Sébastien Rebuffi
19
1
0
04 Aug 2023
Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP
Jiacheng Guo
Zihao Li
Huazheng Wang
Mengdi Wang
Zhuoran Yang
Xuezhou Zhang
32
5
0
21 Jun 2023
Reward-Mixing MDPs with a Few Latent Contexts are Learnable
Jeongyeol Kwon
Yonathan Efroni
C. Caramanis
Shie Mannor
27
5
0
05 Oct 2022
PAC Reinforcement Learning for Predictive State Representations
Wenhao Zhan
Masatoshi Uehara
Wen Sun
Jason D. Lee
31
38
0
12 Jul 2022
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
Masatoshi Uehara
Ayush Sekhari
Jason D. Lee
Nathan Kallus
Wen Sun
58
6
0
24 Jun 2022
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems
Masatoshi Uehara
Ayush Sekhari
Jason D. Lee
Nathan Kallus
Wen Sun
OffRL
49
31
0
24 Jun 2022
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games
Qinghua Liu
Csaba Szepesvári
Chi Jin
29
20
0
02 Jun 2022
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
Miao Lu
Yifei Min
Zhaoran Wang
Zhuoran Yang
OffRL
51
22
0
26 May 2022
Planning in Observable POMDPs in Quasipolynomial Time
Noah Golowich
Ankur Moitra
Dhruv Rohatgi
21
27
0
12 Jan 2022
Reinforcement Learning in Reward-Mixing MDPs
Jeongyeol Kwon
Yonathan Efroni
C. Caramanis
Shie Mannor
27
15
0
07 Oct 2021
Sublinear Regret for Learning POMDPs
Yi Xiong
Ningyuan Chen
Xuefeng Gao
Xiang Zhou
21
25
0
08 Jul 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound
Jeongyeol Kwon
Yonathan Efroni
C. Caramanis
Shie Mannor
21
77
0
09 Feb 2021