arXiv: 2107.03635
Sublinear Regret for Learning POMDPs

8 July 2021
Yi Xiong
Ningyuan Chen
Xuefeng Gao
Xiang Zhou

Papers citing "Sublinear Regret for Learning POMDPs"

26 / 26 papers shown
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
78
20
0
25 Feb 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
67
80
0
09 Feb 2021
Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
Chi Jin
Sham Kakade
A. Krishnamurthy
Qinghua Liu
84
66
0
22 Jun 2020
Regime Switching Bandits
Xiang Zhou
Yi Xiong
Ningyuan Chen
Xuefeng Gao
51
19
0
26 Jan 2020
A Tractable Algorithm For Finite-Horizon Continuous Reinforcement Learning
Phanideep Gampa
Sairam Satwik Kondamudi
L. Kailasam
34
1
0
26 Jun 2019
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
60
72
0
12 Jun 2019
Hedging the Drift: Learning to Optimize under Non-Stationarity
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
65
91
0
04 Mar 2019
Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes
Jian Qian
Ronan Fruit
Matteo Pirotta
A. Lazaric
33
10
0
11 Dec 2018
Is Q-learning Provably Efficient?
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
OffRL
70
807
0
10 Jul 2018
Deep Variational Reinforcement Learning for POMDPs
Maximilian Igl
L. Zintgraf
T. Le
Frank Wood
Shimon Whiteson
BDL OffRL
68
262
0
06 Jun 2018
Multi-Armed Bandits for Correlated Markovian Environments with Smoothed Reward Feedback
Tanner Fiez
S. Sekar
Lillian J. Ratliff
49
8
0
11 Mar 2018
Posterior sampling for reinforcement learning: worst-case regret bounds
Shipra Agrawal
Randy Jia
59
37
0
19 May 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
86
775
0
16 Mar 2017
Consistent order estimation for nonparametric Hidden Markov Models
Luc Lehéricy
35
16
0
02 Jun 2016
A PAC RL Algorithm for Episodic POMDPs
Z. Guo
Shayan Doroudi
Emma Brunskill
77
56
0
25 May 2016
Reinforcement Learning of POMDPs using Spectral Methods
Kamyar Azizzadenesheli
A. Lazaric
Anima Anandkumar
44
128
0
25 Feb 2016
Statistical and Computational Guarantees for the Baum-Welch Algorithm
Fanny Yang
Sivaraman Balakrishnan
Martin J. Wainwright
57
42
0
27 Dec 2015
Deep Recurrent Q-Learning for Partially Observable MDPs
Matthew J. Hausknecht
Peter Stone
108
1,679
0
23 Jul 2015
Consistent estimation of the filtering and marginal smoothing distributions in nonparametric hidden Markov models
Yohann De Castro
Elisabeth Gassiat
Sylvain Le Corff
45
32
0
23 Jul 2015
Statistical guarantees for the EM algorithm: From population to sample-based analysis
Sivaraman Balakrishnan
Martin J. Wainwright
Bin Yu
310
479
0
09 Aug 2014
Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
R. Ortner
D. Ryabko
OffRL
91
85
0
11 Feb 2013
Tensor decompositions for learning latent variable models
Anima Anandkumar
Rong Ge
Daniel J. Hsu
Sham Kakade
Matus Telgarsky
440
1,145
0
29 Oct 2012
Regret Bounds for Restless Markov Bandits
R. Ortner
D. Ryabko
P. Auer
Rémi Munos
96
117
0
12 Sep 2012
Discretized Approximations for POMDP with Average Cost
Huizhen Yu
Dimitri Bertsekas
56
53
0
11 Jul 2012
A Method of Moments for Mixture Models and Hidden Markov Models
Anima Anandkumar
Daniel J. Hsu
Sham Kakade
188
344
0
03 Mar 2012
Linearly Parameterized Bandits
Paat Rusmevichientong
J. Tsitsiklis
389
559
0
18 Dec 2008