Thompson Sampling for Contextual Bandits with Linear Payoffs

15 September 2012

Papers citing "Thompson Sampling for Contextual Bandits with Linear Payoffs"

Prompt Optimization with Logged Bandit Data Haruka Kiyohara Daniel Yiming Cao Yuta Saito Thorsten Joachims 218 0 0 03 Apr 2025
Linear Bandits with Partially Observable Features Wonyoung Hedge Kim Sungwoo Park G. Iyengar A. Zeevi Min Hwan Oh 184 1 0 10 Feb 2025
Distributed Thompson sampling under constrained communication Saba Zerefa Tongzheng Ren Haitong Ma Na Li 92 1 0 03 Jan 2025
AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games Kefan Su Yusen Huo Zhilin Zhang Shuai Dou Chuan Yu Jian Xu Zongqing Lu Bo Zheng 141 7 0 31 Dec 2024
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem Nima Akbarzadeh Erick Delage Yossiri Adulyasak 168 0 0 30 Oct 2024
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit Junyu Cao Ruijiang Gao Esmaeil Keyvanshokooh 195 1 0 18 Oct 2024
Second Order Bounds for Contextual Bandits with Function Approximation Aldo Pacchiano 228 5 0 24 Sep 2024
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback Arun Verma Zhongxiang Dai Xiaoqiang Lin Patrick Jaillet K. H. Low 170 6 0 24 Jul 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF Akhil Agnihotri Rahul Jain Deepak Ramachandran Zheng Wen OffRL 193 2 0 13 Jun 2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off Itai Shufaro Nadav Merlis Nir Weinberger Shie Mannor 164 0 0 26 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces Imad Aouali Victor-Emmanuel Brunel David Rohde Anna Korba OffRL 148 5 0 22 Feb 2024
Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits Nicolas Nguyen Imad Aouali András Gyorgy Claire Vernade 77 2 0 08 Feb 2024
Ensemble sampling for linear bandits: small ensembles suffice David Janz A. Litvak Csaba Szepesvári 100 1 0 14 Nov 2023
Selective Uncertainty Propagation in Offline RL Sanath Kumar Krishnamurthy Shrey Modi Tanmay Gangwani S. Katariya Branislav Kveton A. Rangi OffRL 191 0 0 01 Feb 2023
Safe Linear Thompson Sampling with Side Information Ahmadreza Moradipari Sanae Amani M. Alizadeh Christos Thrampoulidis 139 44 0 06 Nov 2019
Learning to Optimize Via Posterior Sampling Daniel Russo Benjamin Van Roy 203 703 0 11 Jan 2013
Further Optimal Regret Bounds for Thompson Sampling Shipra Agrawal Navin Goyal 107 442 0 15 Sep 2012
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis E. Kaufmann N. Korda Rémi Munos 162 588 0 18 May 2012
Towards minimax policies for online linear optimization with bandit feedback Sébastien Bubeck Nicolò Cesa-Bianchi Sham Kakade OffRL 285 151 0 14 Feb 2012