arXiv:1901.09018

Provably efficient RL with Rich Observations via Latent State Decoding

25 January 2019
S. Du
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
Miroslav Dudík
John Langford
Abstract

We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states. Under certain identifiability assumptions, we demonstrate how to estimate a mapping from the observations to latent states inductively through a sequence of regression and clustering steps (where previously decoded latent states provide labels for later regression problems) and use it to construct good exploration policies. We provide finite-sample guarantees on the quality of the learned state decoding function and exploration policies, and complement our theory with an empirical evaluation on a class of hard exploration problems. Our method exponentially improves over Q-learning with naïve exploration, even when Q-learning has cheating access to latent states.
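
To make the "regression, then clustering" idea concrete, the following is a minimal toy sketch of a single decoding step, written under several assumptions that are not from the paper: a two-state, two-action level with XOR transitions, Gaussian observations around hypothetical per-state means, an affine least-squares regressor standing in for a general function class, a small hand-written k-means with the number of clusters assumed known, and uniformly sampled previous states and actions. All function names are illustrative. It is not the authors' algorithm and carries none of its guarantees; it only shows how already-decoded states at one level can serve as regression labels for the next.

```python
import numpy as np


def one_hot(labels, k):
    """Encode integer labels in {0, ..., k-1} as one-hot rows."""
    out = np.zeros((labels.size, k))
    out[np.arange(labels.size), labels] = 1.0
    return out


def fit_affine_regressor(x, targets):
    """Least-squares fit of an affine map from observations to targets."""
    design = np.hstack([x, np.ones((x.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return lambda q: np.hstack([q, np.ones((q.shape[0], 1))]) @ weights


def kmeans(points, k, iters=20):
    """Tiny Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        centers = np.array([
            points[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
    return assign


def simulate_level(rng, n=2000, dim=5, noise=0.3):
    """Toy data: (decoded previous state, action) -> hidden next state -> observation."""
    prev_state = rng.integers(0, 2, size=n)   # assumed already decoded
    action = rng.integers(0, 2, size=n)       # uniformly random actions
    latent = prev_state ^ action              # hypothetical XOR transition
    means = np.array([[2.0] * dim, [-2.0] * dim])  # hypothetical emission means
    obs = means[latent] + noise * rng.normal(size=(n, dim))
    return prev_state, action, latent, obs


def decode_next_level(prev_state, action, obs, n_prev_states=2, n_actions=2, n_clusters=2):
    """Regress the (previous state, action) pair on the observation, then cluster."""
    pair = prev_state * n_actions + action
    predictor = fit_affine_regressor(obs, one_hot(pair, n_prev_states * n_actions))
    # Observations emitted by the same latent state receive similar predicted
    # vectors, so clustering the predictions groups them by latent state.
    return kmeans(predictor(obs), n_clusters)


rng = np.random.default_rng(0)
prev_state, action, latent, obs = simulate_level(rng)
decoded = decode_next_level(prev_state, action, obs)

# Cluster ids are arbitrary, so compare with the truth up to relabeling.
agreement = np.mean(decoded == latent)
accuracy = max(agreement, 1.0 - agreement)
print(f"decoding accuracy up to relabeling: {accuracy:.3f}")
```

The decoded labels returned here would play the role of `prev_state` when the same step is repeated at the next level, which is the inductive structure the abstract describes.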
