Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems

24 July 2018

Papers citing "Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems"

Title
Stochastic bandits with arm-dependent delays
Anne Gael Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko
19 44 0
18 Jun 2020

An empirical investigation of the challenges of real-world reinforcement learning
Gabriel Dulac-Arnold, Nir Levine, D. Mankowitz, Jerry Li, Cosmin Paduraru, Sven Gowal, Todd Hester
OffRL
34 120 0
24 Mar 2020

Nonstochastic Multiarmed Bandits with Unrestricted Delays
Tobias Sommer Thune, Nicolò Cesa-Bianchi, Yevgeny Seldin
15 52 0
03 Jun 2019