1209.3352
Thompson Sampling for Contextual Bandits with Linear Payoffs
15 September 2012
Shipra Agrawal
Navin Goyal
Papers citing
"Thompson Sampling for Contextual Bandits with Linear Payoffs"
19 / 19 papers shown
Title
Prompt Optimization with Logged Bandit Data
Haruka Kiyohara
Daniel Yiming Cao
Yuta Saito
Thorsten Joachims
218
0
0
03 Apr 2025
Linear Bandits with Partially Observable Features
Wonyoung Hedge Kim
Sungwoo Park
G. Iyengar
A. Zeevi
Min Hwan Oh
184
1
0
10 Feb 2025
Distributed Thompson sampling under constrained communication
Saba Zerefa
Tongzheng Ren
Haitong Ma
Na Li
92
1
0
03 Jan 2025
AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games
Kefan Su
Yusen Huo
Zhilin Zhang
Shuai Dou
Chuan Yu
Jian Xu
Zongqing Lu
Bo Zheng
141
7
0
31 Dec 2024
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem
Nima Akbarzadeh
Erick Delage
Yossiri Adulyasak
168
0
0
30 Oct 2024
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit
Junyu Cao
Ruijiang Gao
Esmaeil Keyvanshokooh
195
1
0
18 Oct 2024
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
228
5
0
24 Sep 2024
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback
Arun Verma
Zhongxiang Dai
Xiaoqiang Lin
Patrick Jaillet
K. H. Low
170
6
0
24 Jul 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
193
2
0
13 Jun 2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off
Itai Shufaro
Nadav Merlis
Nir Weinberger
Shie Mannor
164
0
0
26 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
148
5
0
22 Feb 2024
Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits
Nicolas Nguyen
Imad Aouali
András Gyorgy
Claire Vernade
77
2
0
08 Feb 2024
Ensemble sampling for linear bandits: small ensembles suffice
David Janz
A. Litvak
Csaba Szepesvári
100
1
0
14 Nov 2023
Selective Uncertainty Propagation in Offline RL
Sanath Kumar Krishnamurthy
Shrey Modi
Tanmay Gangwani
S. Katariya
Branislav Kveton
A. Rangi
OffRL
191
0
0
01 Feb 2023
Safe Linear Thompson Sampling with Side Information
Ahmadreza Moradipari
Sanae Amani
M. Alizadeh
Christos Thrampoulidis
139
44
0
06 Nov 2019
Learning to Optimize Via Posterior Sampling
Daniel Russo
Benjamin Van Roy
203
703
0
11 Jan 2013
Further Optimal Regret Bounds for Thompson Sampling
Shipra Agrawal
Navin Goyal
107
442
0
15 Sep 2012
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
E. Kaufmann
N. Korda
Rémi Munos
162
588
0
18 May 2012
Towards minimax policies for online linear optimization with bandit feedback
Sébastien Bubeck
Nicolò Cesa-Bianchi
Sham Kakade
OffRL
285
151
0
14 Feb 2012