Learning Unknown Markov Decision Processes: A Thompson Sampling Approach

14 September 2017
Yi Ouyang
Mukul Gagrani
A. Nayyar
R. Jain

Papers citing "Learning Unknown Markov Decision Processes: A Thompson Sampling Approach"

9 / 9 papers shown
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem
Nima Akbarzadeh
Erick Delage
Yossiri Adulyasak
100
0
0
30 Oct 2024
NeoRL: Efficient Exploration for Nonepisodic RL
Bhavya Sukhija
Lenart Treven
Florian Dorfler
Stelian Coros
Andreas Krause
OffRL
68
0
0
03 Jun 2024
Posterior Sampling for Reinforcement Learning Without Episodes
Ian Osband
Benjamin Van Roy
OffRL
22
22
0
09 Aug 2016
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Ian Osband
Benjamin Van Roy
BDL
74
257
0
01 Jul 2016
Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning
Christoph Dann
Emma Brunskill
50
249
0
29 Oct 2015
(More) Efficient Reinforcement Learning via Posterior Sampling
Ian Osband
Daniel Russo
Benjamin Van Roy
103
529
0
04 Jun 2013
Learning to Optimize Via Posterior Sampling
Daniel Russo
Benjamin Van Roy
137
699
0
11 Jan 2013
REGAL: A Regularization based Algorithm for Reinforcement Learning in Weakly Communicating MDPs
Peter L. Bartlett
Ambuj Tewari
71
280
0
09 May 2012
Optimism in Reinforcement Learning and Kullback-Leibler Divergence
Sarah Filippi
Olivier Cappé
Aurélien Garivier
99
105
0
29 Apr 2010