Tractable Optimality in Episodic Latent MABs

5 October 2022

Jeongyeol Kwon

Yonathan Efroni

Constantine Caramanis

Shie Mannor


Papers citing "Tractable Optimality in Episodic Latent MABs"

24 / 24 papers shown

Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning | Subhojyoti Mukherjee, Josiah P. Hanna, Qiaomin Xie, Robert Nowak | 201 2 0 | 07 Jun 2024
When Is Partially Observable Reinforcement Learning Not Scary? | Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin | 41 97 0 | 19 Apr 2022
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms | Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor | AAML 75 6 0 | 30 Jan 2022
Reinforcement Learning in Reward-Mixing MDPs | Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor | 85 15 0 | 07 Oct 2021
Bayesian decision-making under misspecified priors with applications to meta-learning | Max Simchowitz, Christopher Tosh, A. Krishnamurthy, Daniel J. Hsu, Thodoris Lykouris, Miroslav Dudík, Robert Schapire | 76 50 0 | 03 Jul 2021
Meta-Thompson Sampling | Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvári | 83 61 0 | 11 Feb 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound | Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor | 53 80 0 | 09 Feb 2021
Near-optimal Representation Learning for Linear Bandits and Linear RL | Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang | OffRL 151 53 0 | 08 Feb 2021
Sample-Efficient Reinforcement Learning of Undercomplete POMDPs | Chi Jin, Sham Kakade, A. Krishnamurthy, Qinghua Liu | 82 66 0 | 22 Jun 2020
Regime Switching Bandits | Xiang Zhou, Yi Xiong, Ningyuan Chen, Xuefeng Gao | 45 19 0 | 26 Jan 2020
The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits | Ronshee Chawla, Abishek Sankararaman, A. Ganesh, Sanjay Shakkottai | 49 52 0 | 15 Jan 2020
Learning with Good Feature Representations in Bandits and in RL with a Generative Model | Tor Lattimore, Csaba Szepesvári, Gellert Weisz | OffRL 153 170 0 | 18 Nov 2019
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles | Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh | 68 131 0 | 23 Oct 2019
Optimal estimation of Gaussian mixtures via denoised method of moments | Yihong Wu, Pengkun Yang | 97 75 0 | 19 Jul 2018
Minimax Regret Bounds for Reinforcement Learning | M. G. Azar, Ian Osband, Rémi Munos | 83 774 0 | 16 Mar 2017
Low-rank Bandits with Latent Mixtures | Aditya Gopalan, Odalric-Ambrym Maillard, Mohammadi Zaki | 81 27 0 | 06 Sep 2016
Reinforcement Learning of POMDPs using Spectral Methods | Kamyar Azizzadenesheli, A. Lazaric, Anima Anandkumar | 39 128 0 | 25 Feb 2016
On the Prior Sensitivity of Thompson Sampling | Che-Yu Liu, Lihong Li | 51 25 0 | 10 Jun 2015
Contextual Markov Decision Processes | Assaf Hallak, Dotan Di Castro, Shie Mannor | 89 247 0 | 08 Feb 2015
Online Clustering of Bandits | Claudio Gentile, Shuai Li, Giovanni Zappella | 137 264 0 | 31 Jan 2014
Smoothed Analysis of Tensor Decompositions | Aditya Bhaskara, Moses Charikar, Ankur Moitra, Aravindan Vijayaraghavan | 152 154 0 | 14 Nov 2013
Sample Complexity of Multi-task Reinforcement Learning | Emma Brunskill, Lihong Li | 81 138 0 | 26 Sep 2013
Tensor decompositions for learning latent variable models | Anima Anandkumar, Rong Ge, Daniel J. Hsu, Sham Kakade, Matus Telgarsky | 432 1,144 0 | 29 Oct 2012
Anytime Point-Based Approximations for Large POMDPs | Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun | 96 419 0 | 30 Sep 2011