Learning to Optimize Via Posterior Sampling

11 January 2013
Daniel Russo
Benjamin Van Roy
arXiv: 1301.2609

Papers citing "Learning to Optimize Via Posterior Sampling"

50 / 147 papers shown
Deep Hierarchy in Bandits
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
33
20
0
03 Feb 2022
Meta-Learning Hypothesis Spaces for Sequential Decision-making
Parnian Kassraie
Jonas Rothfuss
Andreas Krause
OffRL
47
6
0
01 Feb 2022
Optimal Regret Is Achievable with Bounded Approximate Inference Error: An Enhanced Bayesian Upper Confidence Bound Framework
Ziyi Huang
Henry Lam
A. Meisami
Haofeng Zhang
41
4
0
31 Jan 2022
Gaussian Imagination in Bandit Learning
Yueyang Liu
Adithya M. Devraj
Benjamin Van Roy
Kuang Xu
40
7
0
06 Jan 2022
A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning
Tongzheng Ren
Tianjun Zhang
Csaba Szepesvári
Bo Dai
39
19
0
22 Nov 2021
Hierarchical Bayesian Bandits
Joey Hong
Branislav Kveton
Manzil Zaheer
Mohammad Ghavamzadeh
FedML
52
38
0
12 Nov 2021
Solving Multi-Arm Bandit Using a Few Bits of Communication
Osama A. Hanna
Lin F. Yang
Christina Fragouli
29
16
0
11 Nov 2021
Online Learning of Energy Consumption for Navigation of Electric Vehicles
Niklas Åkerblom
Yuxin Chen
M. Chehreghani
30
12
0
03 Nov 2021
Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits
Yash J. Patel
Mohamad Kazem Shirani Faradonbeh
16
15
0
23 Oct 2021
Representation Learning for Online and Offline RL in Low-rank MDPs
Masatoshi Uehara
Xuezhou Zhang
Wen Sun
OffRL
67
127
0
09 Oct 2021
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning
Tong Zhang
27
63
0
02 Oct 2021
Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification
James A. Grant
David S. Leslie
50
3
0
29 Sep 2021
Online Learning of Network Bottlenecks via Minimax Paths
Niklas Åkerblom
F. Hoseini
M. Chehreghani
37
10
0
17 Sep 2021
Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models
Runzhe Wan
Linjuan Ge
Rui Song
38
28
0
13 Aug 2021
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
Dibya Ghosh
Jad Rahme
Aviral Kumar
Amy Zhang
Ryan P. Adams
Sergey Levine
OffRL
286
109
0
13 Jul 2021
No Regrets for Learning the Prior in Bandits
Soumya Basu
Branislav Kveton
Manzil Zaheer
Csaba Szepesvári
46
33
0
13 Jul 2021
Metalearning Linear Bandits by Prior Update
Amit Peleg
Naama Pearl
Ron Meir
42
18
0
12 Jul 2021
Information Directed Sampling for Sparse Linear Bandits
Botao Hao
Tor Lattimore
Wei Deng
25
19
0
29 May 2021
Policy Learning with Adaptively Collected Data
Ruohan Zhan
Zhimei Ren
Susan Athey
Zhengyuan Zhou
OffRL
45
27
0
05 May 2021
When and Whom to Collaborate with in a Changing Environment: A Collaborative Dynamic Bandit Solution
Chuanhao Li
Qingyun Wu
Hongning Wang
50
5
0
14 Apr 2021
UCB-based Algorithms for Multinomial Logistic Regression Bandits
Sanae Amani
Christos Thrampoulidis
34
10
0
21 Mar 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
30
70
0
06 Mar 2021
Fairness of Exposure in Stochastic Bandits
Lequn Wang
Yiwei Bai
Wen Sun
Thorsten Joachims
FaML
29
49
0
03 Mar 2021
Online Multi-Armed Bandits with Adaptive Inference
Maria Dimakopoulou
Zhimei Ren
Zhengyuan Zhou
40
34
0
25 Feb 2021
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
39
20
0
25 Feb 2021
Bayesian adversarial multi-node bandit for optimal smart grid protection against cyber attacks
Jianyu Xu
Bin Liu
H. Mo
D. Dong
AAML
16
22
0
20 Feb 2021
The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling
N. Hamidi
Mohsen Bayati
22
1
0
16 Feb 2021
Meta-Thompson Sampling
Branislav Kveton
Mikhail Konobeev
Manzil Zaheer
Chih-Wei Hsu
Martin Mladenov
Craig Boutilier
Csaba Szepesvári
50
61
0
11 Feb 2021
Non-Stationary Latent Bandits
Joey Hong
Branislav Kveton
Manzil Zaheer
Yinlam Chow
Amr Ahmed
Mohammad Ghavamzadeh
Craig Boutilier
OffRL
38
13
0
01 Dec 2020
Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
Ying Fan
Yifei Ming
33
17
0
20 Nov 2020
Asymptotic Convergence of Thompson Sampling
Cem Kalkanli
Ayfer Özgür
8
5
0
08 Nov 2020
Reinforcement Learning for Efficient and Tuning-Free Link Adaptation
Vidit Saxena
H. Tullberg
Joakim Jaldén
21
36
0
16 Oct 2020
Neural Thompson Sampling
Weitong Zhang
Dongruo Zhou
Lihong Li
Quanquan Gu
34
115
0
02 Oct 2020
On Information Gain and Regret Bounds in Gaussian Process Bandits
Sattar Vakili
Kia Khezeli
Victor Picheny
GP
29
128
0
15 Sep 2020
IntelligentPooling: Practical Thompson Sampling for mHealth
Sabina Tomkins
Peng Liao
P. Klasnja
Susan Murphy
41
31
0
31 Jul 2020
A Partially Observable MDP Approach for Sequential Testing for Infectious Diseases such as COVID-19
Rahul Singh
Fang Liu
Ness B. Shroff
23
6
0
25 Jul 2020
Competing Bandits: The Perils of Exploration Under Competition
Guy Aridor
Yishay Mansour
Aleksandrs Slivkins
Zhiwei Steven Wu
25
16
0
20 Jul 2020
Information Theoretic Regret Bounds for Online Nonlinear Control
Sham Kakade
A. Krishnamurthy
Kendall Lowrey
Motoya Ohnishi
Wen Sun
38
117
0
22 Jun 2020
TS-UCB: Improving on Thompson Sampling With Little to No Additional Computation
Jackie Baek
Vivek F. Farias
45
9
0
11 Jun 2020
Scalable Thompson Sampling using Sparse Gaussian Process Models
Sattar Vakili
Henry B. Moss
A. Artemev
Vincent Dutordoir
Victor Picheny
13
34
0
09 Jun 2020
Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization
Daniel Golovin
Qiuyi Zhang
33
70
0
08 Jun 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub
Zeyu Jia
Csaba Szepesvári
Mengdi Wang
Lin F. Yang
OffRL
59
299
0
01 Jun 2020
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users
Shijun Li
Wenqiang Lei
Qingyun Wu
Xiangnan He
Peng Jiang
Tat-Seng Chua
31
118
0
23 May 2020
Online Learning and Optimization for Revenue Management Problems with Add-on Discounts
D. Simchi-Levi
Rui Sun
Huanan Zhang
16
11
0
02 May 2020
Sequential Batch Learning in Finite-Action Linear Contextual Bandits
Yanjun Han
Zhengqing Zhou
Zhengyuan Zhou
Jose H. Blanchet
Peter Glynn
Yinyu Ye
OffRL
9
71
0
14 Apr 2020
Online Residential Demand Response via Contextual Multi-Armed Bandits
Xin Chen
Yutong Nie
Na Li
16
30
0
07 Mar 2020
Improved Optimistic Algorithms for Logistic Bandits
Louis Faury
Marc Abeille
Clément Calauzènes
Olivier Fercoq
23
85
0
18 Feb 2020
Bayesian Optimization for Categorical and Category-Specific Continuous Inputs
Dang Nguyen
Sunil R. Gupta
Santu Rana
A. Shilton
Svetha Venkatesh
16
50
0
28 Nov 2019
Comments on the Du-Kakade-Wang-Yang Lower Bounds
Benjamin Van Roy
Shi Dong
22
38
0
18 Nov 2019
Neural Contextual Bandits with UCB-based Exploration
Dongruo Zhou
Lihong Li
Quanquan Gu
38
15
0
11 Nov 2019