1909.03539
Cited By
Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity
8 September 2019
Peng Liao
Kristjan Greenewald
P. Klasnja
Susan Murphy
Papers citing "Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity"
17 / 17 papers shown
Mitigating Planner Overfitting in Model-Based Reinforcement Learning
Dilip Arumugam
David Abel
Kavosh Asadi
N. Gopalan
Christopher Grimm
Jun Ki Lee
Lucas Lehnert
Michael L. Littman
31
11
0
03 Dec 2018
Semiparametric Contextual Bandits
A. Krishnamurthy
Zhiwei Steven Wu
Vasilis Syrgkanis
112
44
0
12 Mar 2018
Estimation Considerations in Contextual Bandits
Maria Dimakopoulou
Zhengyuan Zhou
Susan Athey
Guido Imbens
266
69
0
19 Nov 2017
Learning Unknown Markov Decision Processes: A Thompson Sampling Approach
Ouyang Yi
Mukul Gagrani
A. Nayyar
R. Jain
49
128
0
14 Sep 2017
Non-Stationary Bandits with Habituation and Recovery Dynamics
Yonatan Dov Mintz
A. Aswani
Philip M. Kaminsky
E. Flowers
Yoshimi Fukuoka
162
57
0
26 Jul 2017
On Optimistic versus Randomized Exploration in Reinforcement Learning
Ian Osband
Benjamin Van Roy
41
11
0
13 Jun 2017
Misspecified Linear Bandits
Avishek Ghosh
Sayak Ray Chowdhury
Aditya Gopalan
47
66
0
23 Apr 2017
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Ian Osband
Benjamin Van Roy
BDL
76
260
0
01 Jul 2016
A Reinforcement Learning System to Encourage Physical Activity in Diabetes Patients
I. Hochberg
G. Feraru
Mark Kozdoba
Shie Mannor
Moshe Tennenholtz
E. Yom-Tov
47
170
0
13 May 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
421
576
0
04 Apr 2016
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies
Vincent François-Lavet
R. Fonteneau
D. Ernst
49
111
0
07 Dec 2015
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
Nan Jiang
Lihong Li
OffRL
200
623
0
11 Nov 2015
(More) Efficient Reinforcement Learning via Posterior Sampling
Ian Osband
Daniel Russo
Benjamin Van Roy
116
533
0
04 Jun 2013
Learning to Optimize Via Posterior Sampling
Daniel Russo
Benjamin Van Roy
193
699
0
11 Jan 2013
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
195
997
0
15 Sep 2012
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
E. Kaufmann
N. Korda
Rémi Munos
157
588
0
18 May 2012
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
456
2,949
0
28 Feb 2010