Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles

12 February 2020

Papers citing "Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles"


Constrained Online Decision-Making: A Unified Framework Haichen Hu David Simchi-Levi Navid Azizan 39 0 0 11 May 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure Aleksandrs Slivkins Yunzong Xu Shiliang Zuo 86 1 0 06 Mar 2025
A Complete Characterization of Learnability for Stochastic Noisy Bandits Steve Hanneke Kun Wang 42 0 0 20 Jan 2025
On The Statistical Complexity of Offline Decision-Making Thanh Nguyen-Tang R. Arora OffRL 53 1 0 10 Jan 2025
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits H. Bui Enrique Mallada Anqi Liu 213 0 0 08 Nov 2024
Second Order Bounds for Contextual Bandits with Function Approximation Aldo Pacchiano 66 4 0 24 Sep 2024
Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning Noah Golowich Ankur Moitra Dhruv Rohatgi OffRL 35 4 0 04 Apr 2024
Online Learning with Unknown Constraints Karthik Sridharan Seung Won Wilson Yoo 33 2 0 06 Mar 2024
Harnessing the Power of Federated Learning in Federated Contextual Bandits Chengshuai Shi Ruida Zhou Kun Yang Cong Shen FedML 33 0 0 26 Dec 2023
Stochastic Graph Bandit Learning with Side-Observations Xueping Gong Jiheng Zhang 34 1 0 29 Aug 2023
Anytime Model Selection in Linear Bandits Parnian Kassraie N. Emmenegger Andreas Krause Aldo Pacchiano 54 2 0 24 Jul 2023
VITS : Variational Inference Thompson Sampling for contextual bandits Pierre Clavier Tom Huix Alain Durmus 29 3 0 19 Jul 2023
Oracle Efficient Online Multicalibration and Omniprediction Sumegha Garg Christopher Jung Omer Reingold Aaron Roth 23 18 0 18 Jul 2023
Neural Exploitation and Exploration of Contextual Bandits Yikun Ban Yuchen Yan A. Banerjee Jingrui He 44 8 0 05 May 2023
Smoothed Analysis of Sequential Probability Assignment Alankrita Bhatt Nika Haghtalab Abhishek Shetty 32 9 0 08 Mar 2023
Sequential Counterfactual Risk Minimization Houssam Zenati Eustache Diemert Matthieu Martin Julien Mairal Pierre Gaillard OffRL 29 3 0 23 Feb 2023
Infinite Action Contextual Bandits with Reusable Data Exhaust Mark Rucker Yinglun Zhu Paul Mineiro OffRL 21 1 0 16 Feb 2023
Multicalibration as Boosting for Regression Ira Globus-Harris Declan Harrison Michael Kearns Aaron Roth Jessica Sorrell 30 21 0 31 Jan 2023
Learning to Generate All Feasible Actions Mirco Theile Daniele Bernardini Raphael Trumpp C. Piazza Marco Caccamo Alberto L. Sangiovanni-Vincentelli 29 2 0 26 Jan 2023
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes Chen Ye Wei Xiong Quanquan Gu Tong Zhang 31 29 0 12 Dec 2022
Eluder-based Regret for Stochastic Contextual MDPs Orin Levy Asaf B. Cassel Alon Cohen Yishay Mansour 35 5 0 27 Nov 2022
Global Optimization with Parametric Function Approximation Chong Liu Yu Wang 38 7 0 16 Nov 2022
Redeeming Intrinsic Rewards via Constrained Optimization Eric Chen Zhang-Wei Hong Joni Pajarinen Pulkit Agrawal OnRL 36 24 0 14 Nov 2022
Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms Osama A. Hanna Lin F. Yang Christina Fragouli 27 11 0 08 Nov 2022
Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees Andrea Tirinzoni Matteo Papini Ahmed Touati A. Lazaric Matteo Pirotta 33 4 0 24 Oct 2022
Optimal Contextual Bandits with Knapsacks under Realizability via Regression Oracles Yuxuan Han Jialin Zeng Yang Wang Yangzhen Xiang Jiheng Zhang 59 9 0 21 Oct 2022
Risk-aware linear bandits with convex loss Patrick Saux Odalric-Ambrym Maillard 27 2 0 15 Sep 2022
Optimistic Whittle Index Policy: Online Learning for Restless Bandits Kai Wang Lily Xu Aparna Taneja Milind Tambe 41 16 0 30 May 2022
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits Gergely Neu Julia Olkhovskaya Matteo Papini Ludovic Schwartz 33 16 0 27 May 2022
Contextual Pandora's Box Alexia Atsidakou Constantine Caramanis Evangelia Gergatsouli Orestis Papadigenopoulos Christos Tzamos 23 5 0 26 May 2022
Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits Avishek Ghosh Abishek Sankararaman 29 3 0 19 May 2022
Efficient Active Learning with Abstention Yinglun Zhu Robert D. Nowak 49 11 0 31 Mar 2022
Flexible and Efficient Contextual Bandits with Heterogeneous Treatment Effect Oracles Aldo G. Carranza Sanath Kumar Krishnamurthy Susan Athey 24 1 0 30 Mar 2022
Oracle-Efficient Online Learning for Beyond Worst-Case Adversaries Nika Haghtalab Yanjun Han Abhishek Shetty Kunhe Yang 41 23 0 17 Feb 2022
An Experimental Design Approach for Regret Minimization in Logistic Bandits Blake Mason Kwang-Sung Jun Lalit P. Jain 29 10 0 04 Feb 2022
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability Aadirupa Saha A. Krishnamurthy 42 35 0 24 Nov 2021
Misspecified Gaussian Process Bandit Optimization Ilija Bogunovic Andreas Krause 57 43 0 09 Nov 2021
Representation Learning for Online and Offline RL in Low-rank MDPs Masatoshi Uehara Xuezhou Zhang Wen Sun OffRL 67 127 0 09 Oct 2021
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning Tong Zhang 27 63 0 02 Oct 2021
Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination Dylan J. Foster A. Krishnamurthy 48 43 0 05 Jul 2021
On component interactions in two-stage recommender systems Jiri Hron K. Krauth Michael I. Jordan Niki Kilbertus CML LRM 40 31 0 28 Jun 2021
Heuristic-Guided Reinforcement Learning Ching-An Cheng Andrey Kolobov Adith Swaminathan OffRL 40 61 0 05 Jun 2021
Information Directed Sampling for Sparse Linear Bandits Botao Hao Tor Lattimore Wei Deng 25 19 0 29 May 2021
An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap Yuanhao Wang Ruosong Wang Sham Kakade OffRL 41 43 0 23 Mar 2021
Neural Thompson Sampling Weitong Zhang Dongruo Zhou Lihong Li Quanquan Gu 34 115 0 02 Oct 2020
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability D. Simchi-Levi Yunzong Xu OffRL 47 107 0 28 Mar 2020
Context-Based Dynamic Pricing with Online Clustering Sentao Miao Xi Chen X. Chao Jiaxi Liu Yidong Zhang 27 31 0 17 Feb 2019