2107.03635
Sublinear Regret for Learning POMDPs
8 July 2021
Yi Xiong
Ningyuan Chen
Xuefeng Gao
Xiang Zhou
Papers citing "Sublinear Regret for Learning POMDPs"
26 / 26 papers shown
Title
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
76
20
0
25 Feb 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
67
80
0
09 Feb 2021
Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
Chi Jin
Sham Kakade
A. Krishnamurthy
Qinghua Liu
84
66
0
22 Jun 2020
Regime Switching Bandits
Xiang Zhou
Yi Xiong
Ningyuan Chen
Xuefeng Gao
51
19
0
26 Jan 2020
A Tractable Algorithm For Finite-Horizon Continuous Reinforcement Learning
Phanideep Gampa
Sairam Satwik Kondamudi
L. Kailasam
34
1
0
26 Jun 2019
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
60
72
0
12 Jun 2019
Hedging the Drift: Learning to Optimize under Non-Stationarity
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
65
91
0
04 Mar 2019
Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes
Jian Qian
Ronan Fruit
Matteo Pirotta
A. Lazaric
33
10
0
11 Dec 2018
Is Q-learning Provably Efficient?
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
OffRL
70
807
0
10 Jul 2018
Deep Variational Reinforcement Learning for POMDPs
Maximilian Igl
L. Zintgraf
T. Le
Frank Wood
Shimon Whiteson
BDL
OffRL
68
262
0
06 Jun 2018
Multi-Armed Bandits for Correlated Markovian Environments with Smoothed Reward Feedback
Tanner Fiez
S. Sekar
Lillian J. Ratliff
47
8
0
11 Mar 2018
Posterior sampling for reinforcement learning: worst-case regret bounds
Shipra Agrawal
Randy Jia
57
37
0
19 May 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
86
775
0
16 Mar 2017
Consistent order estimation for nonparametric Hidden Markov Models
Luc Lehéricy
35
16
0
02 Jun 2016
A PAC RL Algorithm for Episodic POMDPs
Z. Guo
Shayan Doroudi
Emma Brunskill
77
56
0
25 May 2016
Reinforcement Learning of POMDPs using Spectral Methods
Kamyar Azizzadenesheli
A. Lazaric
Anima Anandkumar
44
128
0
25 Feb 2016
Statistical and Computational Guarantees for the Baum-Welch Algorithm
Fanny Yang
Sivaraman Balakrishnan
Martin J. Wainwright
57
42
0
27 Dec 2015
Deep Recurrent Q-Learning for Partially Observable MDPs
Matthew J. Hausknecht
Peter Stone
108
1,679
0
23 Jul 2015
Consistent estimation of the filtering and marginal smoothing distributions in nonparametric hidden Markov models
Yohann De Castro
Elisabeth Gassiat
Sylvain Le Corff
45
32
0
23 Jul 2015
Statistical guarantees for the EM algorithm: From population to sample-based analysis
Sivaraman Balakrishnan
Martin J. Wainwright
Bin Yu
310
479
0
09 Aug 2014
Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
R. Ortner
D. Ryabko
OffRL
91
85
0
11 Feb 2013
Tensor decompositions for learning latent variable models
Anima Anandkumar
Rong Ge
Daniel J. Hsu
Sham Kakade
Matus Telgarsky
440
1,145
0
29 Oct 2012
Regret Bounds for Restless Markov Bandits
R. Ortner
D. Ryabko
P. Auer
Rémi Munos
96
117
0
12 Sep 2012
Discretized Approximations for POMDP with Average Cost
Huizhen Yu
Dimitri Bertsekas
54
53
0
11 Jul 2012
A Method of Moments for Mixture Models and Hidden Markov Models
Anima Anandkumar
Daniel J. Hsu
Sham Kakade
188
344
0
03 Mar 2012
Linearly Parameterized Bandits
Paat Rusmevichientong
J. Tsitsiklis
389
559
0
18 Dec 2008