1905.11817
Cited By
Connections Between Mirror Descent, Thompson Sampling and the Information Ratio
28 May 2019
Julian Zimmert
Tor Lattimore
Papers citing
"Connections Between Mirror Descent, Thompson Sampling and the Information Ratio"
13 / 13 papers shown
Title
A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of Θ(T^{2/3}) and its Application to Best-of-Both-Worlds
Taira Tsuchiya
Shinji Ito
26
0
0
30 May 2024
On the Minimax Regret for Online Learning with Feedback Graphs
Khaled Eldowa
Emmanuel Esposito
Tommaso Cesari
Nicolò Cesa-Bianchi
33
8
0
24 May 2023
Bayesian Reinforcement Learning with Limited Cognitive Load
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
OffRL
34
8
0
05 May 2023
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Christoph Dann
Chen-Yu Wei
Julian Zimmert
26
22
0
20 Feb 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Mengdi Wang
Furong Huang
Dinesh Manocha
24
7
0
28 Jan 2023
Anytime-valid off-policy inference for contextual bandits
Ian Waudby-Smith
Lili Wu
Aaditya Ramdas
Nikos Karampatziakis
Paul Mineiro
OffRL
43
25
0
19 Oct 2022
Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds
Shinji Ito
Taira Tsuchiya
Junya Honda
AAML
23
16
0
14 Jun 2022
Minimax Regret for Partial Monitoring: Infinite Outcomes and Rustichini's Regret
Tor Lattimore
22
16
0
22 Feb 2022
Gaussian Imagination in Bandit Learning
Yueyang Liu
Adithya M. Devraj
Benjamin Van Roy
Kuang Xu
40
7
0
06 Jan 2022
The Value of Information When Deciding What to Learn
Dilip Arumugam
Benjamin Van Roy
37
12
0
26 Oct 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
30
70
0
06 Mar 2021
Exploration by Optimisation in Partial Monitoring
Tor Lattimore
Csaba Szepesvári
33
38
0
12 Jul 2019
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits
Julian Zimmert
Yevgeny Seldin
AAML
24
175
0
19 Jul 2018