2203.01303
Cited By
An Analysis of Ensemble Sampling
2 March 2022
Chao Qin
Zheng Wen
Xiuyuan Lu
Benjamin Van Roy
Papers citing "An Analysis of Ensemble Sampling"
18 / 18 papers shown
Title
Enhancing Diversity in Parallel Agents: A Maximum State Entropy Exploration Story
Vincenzo De Paola
Riccardo Zamboni
Mirco Mutti
Marcello Restelli
19
0
0
02 May 2025
Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model
Moritz A. Zanger
Pascal R. van der Vaart
Wendelin Bohmer
M. Spaan
UQCV
BDL
149
0
0
14 Mar 2025
Improved Regret of Linear Ensemble Sampling
Harin Lee
Min-hwan Oh
37
0
0
06 Nov 2024
Sample-Efficient Alignment for LLMs
Zichen Liu
Changyu Chen
Chao Du
Wee Sun Lee
Min-Bin Lin
36
3
0
03 Nov 2024
Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits
Ziyi Huang
Henry Lam
Haofeng Zhang
33
0
0
20 Jun 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
37
2
0
13 Jun 2024
Waypoint-Based Reinforcement Learning for Robot Manipulation Tasks
Shaunak A. Mehta
Soheil Habibian
Dylan P. Losey
SSL
65
2
0
20 Mar 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li
Jiawei Xu
Lei Han
Zhi-Quan Luo
BDL
OffRL
26
6
0
05 Feb 2024
Ensemble sampling for linear bandits: small ensembles suffice
David Janz
A. Litvak
Csaba Szepesvári
30
2
0
14 Nov 2023
Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling
Zheqing Zhu
Yueyang Liu
Xu Kuang
Benjamin Van Roy
AI4TS
29
0
0
11 Oct 2023
Scalable Neural Contextual Bandit for Recommender Systems
Zheqing Zhu
Benjamin Van Roy
OffRL
24
9
0
26 Jun 2023
Leveraging Demonstrations to Improve Online Learning: Quality Matters
Botao Hao
Rahul Jain
Tor Lattimore
Benjamin Van Roy
Zheng Wen
24
8
0
07 Feb 2023
Multiplier Bootstrap-based Exploration
Runzhe Wan
Haoyu Wei
B. Kveton
R. Song
16
2
0
03 Feb 2023
Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping
Vikranth Dwaracherla
Zheng Wen
Ian Osband
Xiuyuan Lu
S. Asghari
Benjamin Van Roy
UQCV
24
17
0
08 Jun 2022
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits
Gergely Neu
Julia Olkhovskaya
Matteo Papini
Ludovic Schwartz
33
16
0
27 May 2022
Anti-Concentrated Confidence Bonuses for Scalable Exploration
Jordan T. Ash
Cyril Zhang
Surbhi Goel
A. Krishnamurthy
Sham Kakade
35
6
0
21 Oct 2021
Deep Exploration for Recommendation Systems
Zheqing Zhu
Benjamin Van Roy
32
11
0
26 Sep 2021
Stochastic Low-rank Tensor Bandits for Multi-dimensional Online Decision Making
Jie Zhou
Botao Hao
Zheng Wen
Jingfei Zhang
W. Sun
35
6
0
31 Jul 2020