Regret Bound Balancing and Elimination for Model Selection in Bandits and RL

24 December 2020

Claudio Gentile

Peter L. Bartlett

Papers citing "Regret Bound Balancing and Elimination for Model Selection in Bandits and RL"

Title | Authors | Tag | Counts | Date
A Model Selection Approach for Corruption Robust Reinforcement Learning | Chen-Yu Wei, Christoph Dann, Julian Zimmert | | 125 45 0 | 31 Dec 2024
Adapting to Misspecification in Contextual Bandits | Dylan J. Foster, Claudio Gentile, M. Mohri, Julian Zimmert | | 91 86 0 | 12 Jul 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints | Chi Jin, Zhuoran Yang, Zhaoran Wang | OffRL | 239 167 0 | 06 Jan 2021
Online Model Selection for Reinforcement Learning with Function Approximation | Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill | OffRL | 40 37 0 | 19 Nov 2020
Corralling Stochastic Bandit Algorithms | R. Arora, T. V. Marinov, M. Mohri | | 50 35 0 | 16 Jun 2020
Regret Balancing for Bandit and RL Model Selection | Yasin Abbasi-Yadkori, Aldo Pacchiano, My Phan | | 59 26 0 | 09 Jun 2020
Rate-adaptive model selection over a collection of black-box contextual bandit algorithms | Aurélien F. Bibaut, Antoine Chambaz, Mark van der Laan | | 55 6 0 | 05 Jun 2020
Problem-Complexity Adaptive Model Selection for Stochastic Linear Bandits | Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran | | 25 33 0 | 04 Jun 2020
Model Selection in Contextual Stochastic Bandit Problems | Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup B. Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvári | | 166 94 0 | 03 Mar 2020
Learning Near Optimal Policies with Low Inherent Bellman Error | Andrea Zanette, A. Lazaric, Mykel Kochenderfer, Emma Brunskill | OffRL | 71 222 0 | 29 Feb 2020
Model selection for contextual bandits | Dylan J. Foster, A. Krishnamurthy, Haipeng Luo | OffRL | 161 90 0 | 03 Jun 2019
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits | Niladri S. Chatterji, Vidya Muthukumar, Peter L. Bartlett | | 51 44 0 | 24 May 2019
Corralling a Band of Bandit Algorithms | Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, Robert Schapire | | 141 157 0 | 19 Dec 2016
A Contextual-Bandit Approach to Personalized News Article Recommendation | Lihong Li, Wei Chu, John Langford, Robert Schapire | | 456 2,949 0 | 28 Feb 2010
Linearly Parameterized Bandits | Paat Rusmevichientong, J. Tsitsiklis | | 385 559 0 | 18 Dec 2008