1503.09105
Cited By
Two Timescale Stochastic Approximation with Controlled Markov noise and Off-policy temporal difference learning
31 March 2015
Prasenjit Karmakar
S. Bhatnagar
Papers citing "Two Timescale Stochastic Approximation with Controlled Markov noise and Off-policy temporal difference learning"
4 / 4 papers shown
| Title | Authors | Tags | | | | Date |
|---|---|---|---|---|---|---|
| Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize | Huizhen Yu | | 60 | 30 | 0 | 23 Nov 2015 |
| On the Complexity of Best Arm Identification in Multi-Armed Bandit Models | E. Kaufmann, Olivier Cappé, Aurélien Garivier | | 193 | 1,025 | 0 | 16 Jul 2014 |
| Off-Policy Actor-Critic | T. Degris, Martha White, R. Sutton | OffRL, CML | 230 | 220 | 0 | 22 May 2012 |
| Convergence and Convergence Rate of Stochastic Gradient Search in the Case of Multiple and Non-Isolated Extrema | V. Tadic | | 102 | 19 | 0 | 06 Jul 2009 |