Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure

29 January 2021
Joseph Lubars
Anna Winnicki
Michael Livesay
R. Srikant
arXiv: 2102.00030

Papers citing "Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure"

2 / 2 papers shown
On The Convergence Of Policy Iteration-Based Reinforcement Learning With Monte Carlo Policy Evaluation
Anna Winnicki
R. Srikant
92
9
0
23 Jan 2023
Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation
Anna Winnicki
R. Srikant
OffRL
54
6
0
13 Oct 2022