2102.00030
Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure
29 January 2021
Joseph Lubars
Anna Winnicki
Michael Livesay
R. Srikant
Papers citing "Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure"
2 / 2 papers shown
Title
On The Convergence Of Policy Iteration-Based Reinforcement Learning With Monte Carlo Policy Evaluation
Anna Winnicki
R. Srikant
92
9
0
23 Jan 2023
Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation
Anna Winnicki
R. Srikant
OffRL
54
6
0
13 Oct 2022