ResearchTrend.AI
Perturbed-History Exploration in Stochastic Multi-Armed Bandits

26 February 2019
Branislav Kveton
Csaba Szepesvári
Mohammad Ghavamzadeh
Craig Boutilier

Papers citing "Perturbed-History Exploration in Stochastic Multi-Armed Bandits"

10 / 10 papers shown
Perturbed-History Exploration in Stochastic Linear Bandits
Branislav Kveton
Csaba Szepesvári
Mohammad Ghavamzadeh
Craig Boutilier
36
43
0
21 Mar 2019
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits
Branislav Kveton
Csaba Szepesvári
Sharan Vaswani
Zheng Wen
Mohammad Ghavamzadeh
Tor Lattimore
142
70
0
13 Nov 2018
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
C. Riquelme
George Tucker
Jasper Snoek
BDL
79
366
0
26 Feb 2018
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems
Zachary Chase Lipton
Xiujun Li
Jianfeng Gao
Lihong Li
Faisal Ahmed
Li Deng
68
172
0
15 Nov 2017
Ensemble Sampling
Xiuyuan Lu
Benjamin Van Roy
129
121
0
20 May 2017
Cascading Bandits: Learning to Rank in the Cascade Model
Branislav Kveton
Csaba Szepesvári
Zheng Wen
Azin Ashkan
197
284
0
10 Feb 2015
Thompson Sampling for Complex Bandit Problems
Aditya Gopalan
Shie Mannor
Yishay Mansour
152
203
0
03 Nov 2013
An efficient algorithm for learning with semi-bandit feedback
Gergely Neu
Gábor Bartók
127
80
0
13 May 2013
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
195
1,006
0
15 Sep 2012
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
Aurélien Garivier
Olivier Cappé
180
615
0
12 Feb 2011