1301.2609
Cited By
Learning to Optimize Via Posterior Sampling
11 January 2013
Daniel Russo
Benjamin Van Roy
Papers citing
"Learning to Optimize Via Posterior Sampling"
50 / 147 papers shown
Title
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
94
1
0
29 Apr 2025
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments
Yun Qu
Wenjie Wang
Yixiu Mao
Yiqin Lv
Xiangyang Ji
TTA
93
0
0
27 Apr 2025
Online Planning of Power Flows for Power Systems Against Bushfires Using Spatial Context
Jianyu Xu
Qiuzhuang Sun
Yang Yang
Huadong Mo
Daoyi Dong
83
0
0
24 Feb 2025
Improved Regret Analysis in Gaussian Process Bandits: Optimality for Noiseless Reward, RKHS norm, and Non-Stationary Variance
S. Iwazaki
Shion Takeno
85
1
0
10 Feb 2025
Causal Discovery via Bayesian Optimization
Bao Duong
Sunil Gupta
Thin Nguyen
55
0
0
28 Jan 2025
Improved Regret of Linear Ensemble Sampling
Harin Lee
Min-hwan Oh
42
1
0
06 Nov 2024
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem
Nima Akbarzadeh
Erick Delage
Yossiri Adulyasak
43
0
0
30 Oct 2024
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding
Taiwo A. Adebiyi
Bach Do
Ruda Zhang
114
2
0
29 Oct 2024
HR-Bandit: Human-AI Collaborated Linear Recourse Bandit
Junyu Cao
Ruijiang Gao
Esmaeil Keyvanshokooh
45
1
0
18 Oct 2024
Advances in Preference-based Reinforcement Learning: A Review
Youssef Abdelkareem
Shady Shehata
Fakhri Karray
OffRL
56
10
0
21 Aug 2024
Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits
Ziyi Huang
Henry Lam
Haofeng Zhang
38
0
0
20 Jun 2024
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning
S. Samsonov
Eric Moulines
Qi-Man Shao
Zhuo-Song Zhang
Alexey Naumov
38
4
0
26 May 2024
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback
Ruitao Chen
Liwei Wang
75
1
0
18 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
41
5
0
22 Feb 2024
Incentivized Exploration via Filtered Posterior Sampling
Anand Kalvit
Aleksandrs Slivkins
Yonatan Gur
29
1
0
20 Feb 2024
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
Thanh Nguyen-Tang
Raman Arora
OffRL
44
3
0
06 Jan 2024
Posterior Sampling-based Online Learning for Episodic POMDPs
Dengwang Tang
Dongze Ye
Rahul Jain
A. Nayyar
Pierluigi Nuzzo
OffRL
57
0
0
16 Oct 2023
Pseudo-Bayesian Optimization
Haoxian Chen
Henry Lam
39
2
0
15 Oct 2023
Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach
Arman Rahbar
Niklas Åkerblom
M. Chehreghani
33
0
0
21 Aug 2023
VITS : Variational Inference Thompson Sampling for contextual bandits
Pierre Clavier
Tom Huix
Alain Durmus
32
3
0
19 Jul 2023
Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits
Yuwei Luo
Mohsen Bayati
26
1
0
26 Jun 2023
Incentivizing Exploration with Linear Contexts and Combinatorial Actions
Mark Sellke
34
3
0
03 Jun 2023
Multi-objective optimisation via the R2 utilities
Ben Tu
N. Kantas
Robert M. Lee
B. Shafei
229
3
0
19 May 2023
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards
Hao Qin
Kwang-Sung Jun
Chicheng Zhang
46
0
0
28 Apr 2023
Simulating Gaussian vectors via randomized dimension reduction and PCA
N. Kahalé
35
0
0
14 Apr 2023
Thompson Sampling for Linear Bandit Problems with Normal-Gamma Priors
Björn Lindenberg
Karl-Olof Lindahl
30
0
0
06 Mar 2023
Algorithm Selection for Deep Active Learning with Imbalanced Datasets
Jifan Zhang
Shuai Shao
Saurabh Verma
Robert D. Nowak
33
20
0
14 Feb 2023
Randomized Gaussian Process Upper Confidence Bound with Tighter Bayesian Regret Bounds
Shion Takeno
Yu Inatsu
Masayuki Karasuyama
35
13
0
03 Feb 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Mengdi Wang
Furong Huang
Dinesh Manocha
24
8
0
28 Jan 2023
Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient
Dylan J. Foster
Noah Golowich
Yanjun Han
OffRL
36
29
0
19 Jan 2023
Multi-Task Off-Policy Learning from Bandit Feedback
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
OffRL
37
10
0
09 Dec 2022
Monte Carlo Tree Search Algorithms for Risk-Aware and Multi-Objective Reinforcement Learning
Conor F. Hayes
Mathieu Reymond
D. Roijers
Enda Howley
Patrick Mannion
26
4
0
23 Nov 2022
Distributed Resource Allocation for URLLC in IIoT Scenarios: A Multi-Armed Bandit Approach
Francesco Pase
M. Giordani
Giampaolo Cuozzo
Sara Cavallero
J. Eichinger
Roberto Verdone
M. Zorzi
34
9
0
22 Nov 2022
Bayesian Fixed-Budget Best-Arm Identification
Alexia Atsidakou
S. Katariya
Sujay Sanghavi
Branislav Kveton
35
11
0
15 Nov 2022
Robust Contextual Linear Bandits
Rong Zhu
Branislav Kveton
27
3
0
26 Oct 2022
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
Wei Xiong
Han Zhong
Chengshuai Shi
Cong Shen
Tong Zhang
66
18
0
04 Oct 2022
A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning
Christoph Dann
M. Mohri
Tong Zhang
Julian Zimmert
OffRL
23
33
0
23 Aug 2022
Delayed Feedback in Generalised Linear Bandits Revisited
Benjamin Howson
Ciara Pike-Burke
Sarah Filippi
10
14
0
21 Jul 2022
Graph Neural Network Bandits
Parnian Kassraie
Andreas Krause
Ilija Bogunovic
38
11
0
13 Jul 2022
POEM: Out-of-Distribution Detection with Posterior Sampling
Yifei Ming
Ying Fan
Yixuan Li
OODD
44
114
0
28 Jun 2022
How to talk so AI will learn: Instructions, descriptions, and autonomy
T. Sumers
Robert D. Hawkins
Mark K. Ho
Thomas Griffiths
Dylan Hadfield-Menell
LM&Ro
43
20
0
16 Jun 2022
Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits
Tianyuan Jin
Pan Xu
X. Xiao
Anima Anandkumar
49
12
0
07 Jun 2022
Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization
Hui Yuan
Chengzhuo Ni
Huazheng Wang
Xuezhou Zhang
Le Cong
Csaba Szepesvári
Mengdi Wang
28
2
0
05 Jun 2022
Surrogate modeling for Bayesian optimization beyond a single Gaussian process
Qin Lu
Konstantinos D. Polyzos
Bingcong Li
G. Giannakis
GP
38
18
0
27 May 2022
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits
Gergely Neu
Julia Olkhovskaya
Matteo Papini
Ludovic Schwartz
44
16
0
27 May 2022
Multi-Environment Meta-Learning in Stochastic Linear Bandits
Ahmadreza Moradipari
Mohammad Ghavamzadeh
Taha Rajabzadeh
Christos Thrampoulidis
M. Alizadeh
24
4
0
12 May 2022
Non-Stationary Bandit Learning via Predictive Sampling
Yueyang Liu
Kuang Xu
Benjamin Van Roy
28
19
0
04 May 2022
Rate-Constrained Remote Contextual Bandits
Francesco Pase
Deniz Gündüz
M. Zorzi
39
8
0
26 Apr 2022
Stochastic Conservative Contextual Linear Bandits
Jiabin Lin
Xian Yeow Lee
Talukder Jubery
Shana Moothedath
Soumik Sarkar
Baskar Ganapathysubramanian
16
7
0
29 Mar 2022
Truncated LinUCB for Stochastic Linear Bandits
Yanglei Song
Meng Zhou
52
0
0
23 Feb 2022