arXiv:1911.09724
Information-Theoretic Confidence Bounds for Reinforcement Learning
21 November 2019
Xiuyuan Lu
Benjamin Van Roy
Papers citing "Information-Theoretic Confidence Bounds for Reinforcement Learning"
20 / 20 papers shown
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
94
1
0
29 Apr 2025
On Bits and Bandits: Quantifying the Regret-Information Trade-off
Itai Shufaro
Nadav Merlis
Nir Weinberger
Shie Mannor
40
0
0
26 May 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li
Jiawei Xu
Lei Han
Zhi-Quan Luo
BDL
OffRL
36
5
0
05 Feb 2024
Bayesian Reinforcement Learning with Limited Cognitive Load
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
OffRL
39
8
0
05 May 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Mengdi Wang
Furong Huang
Dinesh Manocha
24
8
0
28 Jan 2023
Multi-Task Off-Policy Learning from Bandit Feedback
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
OffRL
40
10
0
09 Dec 2022
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
36
4
0
30 Oct 2022
Graph Neural Network Bandits
Parnian Kassraie
Andreas Krause
Ilija Bogunovic
38
11
0
13 Jul 2022
Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework
Runzhe Wan
Linjuan Ge
Rui Song
23
13
0
26 Feb 2022
Meta-Learning for Simple Regret Minimization
Javad Azizi
Branislav Kveton
Mohammad Ghavamzadeh
S. Katariya
22
10
0
25 Feb 2022
Deep Hierarchy in Bandits
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
33
20
0
03 Feb 2022
Gaussian Imagination in Bandit Learning
Yueyang Liu
Adithya M. Devraj
Benjamin Van Roy
Kuang Xu
40
7
0
06 Jan 2022
Hierarchical Bayesian Bandits
Joey Hong
Branislav Kveton
Manzil Zaheer
Mohammad Ghavamzadeh
FedML
52
38
0
12 Nov 2021
No Regrets for Learning the Prior in Bandits
Soumya Basu
Branislav Kveton
Manzil Zaheer
Csaba Szepesvári
46
33
0
13 Jul 2021
Metalearning Linear Bandits by Prior Update
Amit Peleg
Naama Pearl
Ron Meir
42
18
0
12 Jul 2021
Information Directed Sampling for Sparse Linear Bandits
Botao Hao
Tor Lattimore
Wei Deng
25
19
0
29 May 2021
An Upper Confidence Bound for Simultaneous Exploration and Exploitation in Heterogeneous Multi-Robot Systems
Ki Myung Brian Lee
Felix H. Kong
Ricardo Cannizzaro
J. Palmer
David Johnson
C. Yoo
Robert Fitch
47
15
0
13 May 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
30
70
0
06 Mar 2021
Meta-Thompson Sampling
Branislav Kveton
Mikhail Konobeev
Manzil Zaheer
Chih-Wei Hsu
Martin Mladenov
Craig Boutilier
Csaba Szepesvári
50
61
0
11 Feb 2021
Information Theoretic Regret Bounds for Online Nonlinear Control
Sham Kakade
A. Krishnamurthy
Kendall Lowrey
Motoya Ohnishi
Wen Sun
38
117
0
22 Jun 2020