An Analysis of Ensemble Sampling

2 March 2022

Benjamin Van Roy

Papers citing "An Analysis of Ensemble Sampling"

Title
Enhancing Diversity in Parallel Agents: A Maximum State Entropy Exploration Story | Vincenzo De Paola, Riccardo Zamboni, Mirco Mutti, Marcello Restelli | 19 0 0 | 02 May 2025
Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model | Moritz A. Zanger, Pascal R. van der Vaart, Wendelin Bohmer, M. Spaan | UQCV BDL 149 0 0 | 14 Mar 2025
Improved Regret of Linear Ensemble Sampling | Harin Lee, Min-hwan Oh | 37 0 0 | 06 Nov 2024
Sample-Efficient Alignment for LLMs | Zichen Liu, Changyu Chen, Chao Du, Wee Sun Lee, Min-Bin Lin | 36 3 0 | 03 Nov 2024
Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits | Ziyi Huang, Henry Lam, Haofeng Zhang | 33 0 0 | 20 Jun 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF | Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Zheng Wen | OffRL 37 2 0 | 13 Jun 2024
Waypoint-Based Reinforcement Learning for Robot Manipulation Tasks | Shaunak A. Mehta, Soheil Habibian, Dylan P. Losey | SSL 65 2 0 | 20 Mar 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent | Yingru Li, Jiawei Xu, Lei Han, Zhi-Quan Luo | BDL OffRL 26 6 0 | 05 Feb 2024
Ensemble sampling for linear bandits: small ensembles suffice | David Janz, A. Litvak, Csaba Szepesvári | 30 2 0 | 14 Nov 2023
Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling | Zheqing Zhu, Yueyang Liu, Xu Kuang, Benjamin Van Roy | AI4TS 29 0 0 | 11 Oct 2023
Scalable Neural Contextual Bandit for Recommender Systems | Zheqing Zhu, Benjamin Van Roy | OffRL 24 9 0 | 26 Jun 2023
Leveraging Demonstrations to Improve Online Learning: Quality Matters | Botao Hao, Rahul Jain, Tor Lattimore, Benjamin Van Roy, Zheng Wen | 24 8 0 | 07 Feb 2023
Multiplier Bootstrap-based Exploration | Runzhe Wan, Haoyu Wei, B. Kveton, R. Song | 16 2 0 | 03 Feb 2023
Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping | Vikranth Dwaracherla, Zheng Wen, Ian Osband, Xiuyuan Lu, S. Asghari, Benjamin Van Roy | UQCV 24 17 0 | 08 Jun 2022
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits | Gergely Neu, Julia Olkhovskaya, Matteo Papini, Ludovic Schwartz | 33 16 0 | 27 May 2022
Anti-Concentrated Confidence Bonuses for Scalable Exploration | Jordan T. Ash, Cyril Zhang, Surbhi Goel, A. Krishnamurthy, Sham Kakade | 35 6 0 | 21 Oct 2021
Deep Exploration for Recommendation Systems | Zheqing Zhu, Benjamin Van Roy | 32 11 0 | 26 Sep 2021
Stochastic Low-rank Tensor Bandits for Multi-dimensional Online Decision Making | Jie Zhou, Botao Hao, Zheng Wen, Jingfei Zhang, W. Sun | 35 6 0 | 31 Jul 2020