Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound

24 May 2019

Papers citing "Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound"

26 / 26 papers shown

Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation Long-Fei Li Yu Zhang Peng Zhao Zhi Zhou 241 5 0 17 Jan 2025
Spectral Representation for Causal Estimation with Hidden Confounders Zhaolin Ren Haotian Sun Antoine Moulin Arthur Gretton Bo Dai CML 114 3 0 15 Jul 2024
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning Subhojyoti Mukherjee Josiah P. Hanna Qiaomin Xie Robert Nowak 241 2 0 07 Jun 2024
On Online Learning in Kernelized Markov Decision Processes Sayak Ray Chowdhury Aditya Gopalan OffRL 65 48 0 04 Nov 2019
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles Aditya Modi Nan Jiang Ambuj Tewari Satinder Singh 70 132 0 23 Oct 2019
Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning? S. Du Sham Kakade Ruosong Wang Lin F. Yang 204 193 0 07 Oct 2019
Provably Efficient Reinforcement Learning with Linear Function Approximation Chi Jin Zhuoran Yang Zhaoran Wang Michael I. Jordan 109 560 0 11 Jul 2019
No-regret Exploration in Contextual Reinforcement Learning Aditya Modi Ambuj Tewari OffRL 37 14 0 14 Mar 2019
Sample-Optimal Parametric Q-Learning Using Linearly Additive Features Lin F. Yang Mengdi Wang VLM 56 14 0 13 Feb 2019
Policy Certificates: Towards Accountable Reinforcement Learning Christoph Dann Lihong Li Wei Wei Emma Brunskill OffRL 143 146 0 07 Nov 2018
Is Q-learning Provably Efficient? Chi Jin Zeyuan Allen-Zhu Sébastien Bubeck Michael I. Jordan OffRL 84 812 0 10 Jul 2018
What Doubling Tricks Can and Can't Do for Multi-Armed Bandits Lilian Besson E. Kaufmann 79 116 0 19 Mar 2018
Efficient Exploration through Bayesian Deep Q-Networks Kamyar Azizzadenesheli Anima Anandkumar OffRL BDL 87 163 0 13 Feb 2018
On Kernelized Multi-armed Bandits Sayak Ray Chowdhury Aditya Gopalan 127 463 0 03 Apr 2017
Deep Exploration via Randomized Value Functions Ian Osband Benjamin Van Roy Daniel Russo Zheng Wen 116 307 0 22 Mar 2017
Minimax Regret Bounds for Reinforcement Learning M. G. Azar Ian Osband Rémi Munos 95 778 0 16 Mar 2017
Reinforcement Learning in Rich-Observation MDPs using Spectral Methods Kamyar Azizzadenesheli A. Lazaric Anima Anandkumar 70 31 0 11 Nov 2016
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving Shai Shalev-Shwartz Shaked Shammah Amnon Shashua 120 840 0 11 Oct 2016
On Lower Bounds for Regret in Reinforcement Learning Ian Osband Benjamin Van Roy 85 101 0 09 Aug 2016
Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning Christoph Dann Emma Brunskill 78 249 0 29 Oct 2015
Randomized sketches for kernels: Fast and optimal non-parametric regression Yun Yang Mert Pilanci Martin J. Wainwright 95 174 0 25 Jan 2015
Playing Atari with Deep Reinforcement Learning Volodymyr Mnih Koray Kavukcuoglu David Silver Alex Graves Ioannis Antonoglou Daan Wierstra Martin Riedmiller 132 12,272 0 19 Dec 2013
Finite-Time Analysis of Kernelised Contextual Bandits Michal Valko N. Korda Rémi Munos I. Flaounas N. Cristianini 193 275 0 26 Sep 2013
A Contextual-Bandit Approach to Personalized News Article Recommendation Lihong Li Wei Chu John Langford Robert Schapire 473 2,958 0 28 Feb 2010
Linearly Parameterized Bandits Paat Rusmevichientong J. Tsitsiklis 409 562 0 18 Dec 2008
A Spectral Algorithm for Learning Hidden Markov Models Daniel J. Hsu Sham Kakade Tong Zhang 201 310 0 26 Nov 2008