Reinforcement Learning of POMDPs using Spectral Methods

25 February 2016

Papers citing "Reinforcement Learning of POMDPs using Spectral Methods"


| Title | Authors | Tag | | Date |
|---|---|---|---|---|
| Mixing time estimation in reversible Markov chains from a single sample path | Daniel J. Hsu, A. Kontorovich, D. A. Levin, Yuval Peres, Csaba Szepesvári | | 39 82 0 | 24 Aug 2017 |
| A PAC RL Algorithm for Episodic POMDPs | Z. Guo, Shayan Doroudi, Emma Brunskill | | 65 56 0 | 25 May 2016 |
| PAC Reinforcement Learning with Rich Observations | A. Krishnamurthy, Alekh Agarwal, John Langford | OffRL | 11 8 0 | 08 Feb 2016 |
| Selecting Near-Optimal Approximate State Representations in Reinforcement Learning | R. Ortner, Odalric-Ambrym Maillard, D. Ryabko | | 120 27 0 | 12 May 2014 |
| Playing Atari with Deep Reinforcement Learning | Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller | | 103 12,163 0 | 19 Dec 2013 |
| Efficient Learning and Planning with Compressed Predictive States | William L. Hamilton, M. M. Fard, Joelle Pineau | | 50 41 0 | 01 Dec 2013 |
| Nonparametric Estimation of Multi-View Latent Variable Models | Le Song, Anima Anandkumar, Bo Dai, Bo Xie | | 43 44 0 | 13 Nov 2013 |
| Sequential Transfer in Multi-armed Bandit with Finite Set of Models | M. G. Azar, A. Lazaric, Emma Brunskill | OffRL | 99 114 0 | 25 Jul 2013 |
| Regret Bounds for Reinforcement Learning with Policy Advice | M. G. Azar, A. Lazaric, Emma Brunskill | | 65 36 0 | 05 May 2013 |
| On learning parametric-output HMMs | A. Kontorovich, B. Nadler, Roi Weiss | | 50 37 0 | 25 Feb 2013 |
| PEGASUS: A Policy Search Method for Large MDPs and POMDPs | A. Ng, Michael I. Jordan | | 63 496 0 | 16 Jan 2013 |
| Tensor decompositions for learning latent variable models | Anima Anandkumar, Rong Ge, Daniel J. Hsu, Sham Kakade, Matus Telgarsky | | 273 1,142 0 | 29 Oct 2012 |
| REGAL: A Regularization based Algorithm for Reinforcement Learning in Weakly Communicating MDPs | Peter L. Bartlett, Ambuj Tewari | | 71 280 0 | 09 May 2012 |
| A Method of Moments for Mixture Models and Hidden Markov Models | Anima Anandkumar, Daniel J. Hsu, Sham Kakade | | 127 341 0 | 03 Mar 2012 |
| Infinite-Horizon Policy-Gradient Estimation | Jonathan Baxter, Peter L. Bartlett | | 70 808 0 | 03 Jun 2011 |
| Closing the Learning-Planning Loop with Predictive State Representations | Byron Boots, S. Siddiqi, Geoffrey J. Gordon | | 204 265 0 | 12 Dec 2009 |