Mostly Exploration-Free Algorithms for Contextual Bandits

28 April 2017

Papers citing "Mostly Exploration-Free Algorithms for Contextual Bandits"

Contextual Online Uncertainty-Aware Preference Learning for Human Feedback Nan Lu Ethan X. Fang Junwei Lu 257 0 0 27 Apr 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure Aleksandrs Slivkins Yunzong Xu Shiliang Zuo 86 1 0 06 Mar 2025
Contextual Bandits for Unbounded Context Distributions Puning Zhao Xiaogang Xu Zhe Liu Huiwen Wu Qin Zhang Zong Ke Tianhang Zheng 76 4 0 19 Aug 2024
Batched Nonparametric Contextual Bandits Rong Jiang Cong Ma OffRL 46 1 0 27 Feb 2024
Incentivized Exploration via Filtered Posterior Sampling Anand Kalvit Aleksandrs Slivkins Yonatan Gur 29 1 0 20 Feb 2024
Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits Yuwei Luo Mohsen Bayati 26 1 0 26 Jun 2023
Bandit Social Learning: Exploration under Myopic Behavior Kiarash Banihashem Mohammadtaghi Hajiaghayi Suho Shin Aleksandrs Slivkins 24 4 0 15 Feb 2023
Transfer Learning for Contextual Multi-armed Bandits Changxiao Cai T. Tony Cai Hongzhe Li 47 16 0 22 Nov 2022
Advertising Media and Target Audience Optimization via High-dimensional Bandits Wenjia Ba J. Harrison Harikesh S. Nair 16 0 0 17 Sep 2022
Risk-aware linear bandits with convex loss Patrick Saux Odalric-Ambrym Maillard 27 2 0 15 Sep 2022
Meta Representation Learning with Contextual Linear Bandits Leonardo Cella Karim Lounici Massimiliano Pontil 62 5 0 30 May 2022
Worst-case Performance of Greedy Policies in Bandits with Imperfect Context Observations Hongju Park Mohamad Kazem Shirani Faradonbeh OffRL 29 2 0 10 Apr 2022
Truncated LinUCB for Stochastic Linear Bandits Yanglei Song Meng Zhou 52 0 0 23 Feb 2022
Efficient Algorithms for Learning to Control Bandits with Unobserved Contexts Hongju Park Mohamad Kazem Shirani Faradonbeh 23 6 0 02 Feb 2022
Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits Yash J. Patel Mohamad Kazem Shirani Faradonbeh 16 15 0 23 Oct 2021
Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification James A. Grant David S. Leslie 50 3 0 29 Sep 2021
Safe Policy Learning through Extrapolation: Application to Pre-trial Risk Assessment Eli Ben-Michael D. J. Greiner Kosuke Imai Zhichao Jiang OffRL 33 22 0 22 Sep 2021
Dynamic Selection in Algorithmic Decision-making Jin Li Ye Luo Xiaowei Zhang 34 2 0 28 Aug 2021
On component interactions in two-stage recommender systems Jiri Hron K. Krauth Michael I. Jordan Niki Kilbertus CML LRM 42 31 0 28 Jun 2021
Fair Exploration via Axiomatic Bargaining Jackie Baek Vivek F. Farias FaML 18 28 0 04 Jun 2021
Leveraging Good Representations in Linear Contextual Bandits Matteo Papini Andrea Tirinzoni Marcello Restelli A. Lazaric Matteo Pirotta 38 26 0 08 Apr 2021
Competing Bandits: The Perils of Exploration Under Competition Guy Aridor Yishay Mansour Aleksandrs Slivkins Zhiwei Steven Wu 25 16 0 20 Jul 2020
TS-UCB: Improving on Thompson Sampling With Little to No Additional Computation Jackie Baek Vivek F. Farias 45 9 0 11 Jun 2020
An Efficient Algorithm For Generalized Linear Bandit: Online Stochastic Gradient Descent and Thompson Sampling Qin Ding Cho-Jui Hsieh James Sharpnack 25 37 0 07 Jun 2020
Greedy Algorithm almost Dominates in Smoothed Contextual Bandits Manish Raghavan Aleksandrs Slivkins Jennifer Wortman Vaughan Zhiwei Steven Wu 21 18 0 19 May 2020
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability D. Simchi-Levi Yunzong Xu OffRL 54 109 0 28 Mar 2020
Bounded Regret for Finitely Parameterized Multi-Armed Bandits Kishan Panaganti D. Kalathil 18 1 0 03 Mar 2020
Information Directed Sampling for Linear Partial Monitoring Johannes Kirschner Tor Lattimore Andreas Krause 24 46 0 25 Feb 2020
Adaptive Exploration in Linear Contextual Bandit Botao Hao Tor Lattimore Csaba Szepesvári 30 74 0 15 Oct 2019
Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes Yichun Hu Nathan Kallus Xiaojie Mao 35 34 0 05 Sep 2019
Adaptive Robot-Assisted Feeding: An Online Learning Framework for Acquiring Previously Unseen Food Items E. Gordon Xiang Meng Matt Barnes Tapomayukh Bhattacharjee S. Srinivasa OffRL OnRL 18 45 0 19 Aug 2019
Convergence Rates of Posterior Distributions in Markov Decision Process Zhen Li E. Laber 20 0 0 22 Jul 2019
Stochastic Bandits with Context Distributions Johannes Kirschner Andreas Krause 29 30 0 06 Jun 2019
Model selection for contextual bandits Dylan J. Foster A. Krishnamurthy Haipeng Luo OffRL 36 90 0 03 Jun 2019
Rarely-switching linear bandits: optimization of causal effects for the real world B. Lansdell Sofia Triantafillou Konrad Paul Kording 22 4 0 30 May 2019
Meta Dynamic Pricing: Transfer Learning Across Experiments Hamsa Bastani D. Simchi-Levi Ruihao Zhu 39 88 0 28 Feb 2019
Bayesian Exploration with Heterogeneous Agents Nicole Immorlica Jieming Mao Aleksandrs Slivkins Zhiwei Steven Wu 32 24 0 19 Feb 2019
Input Perturbations for Adaptive Control and Learning Mohamad Kazem Shirani Faradonbeh Ambuj Tewari George Michailidis 21 46 0 10 Nov 2018
Inventory Balancing with Online Learning Wang Chi Cheung Will Ma D. Simchi-Levi Xinshang Wang 24 16 0 11 Oct 2018
Dynamic Pricing in High-dimensions Adel Javanmard Hamid Nazerzadeh 78 138 0 24 Sep 2016