Optimal Best-Arm Identification in Bandits with Access to Offline Data

15 June 2023

Papers citing "Optimal Best-Arm Identification in Bandits with Access to Offline Data"

Online Bandit Learning with Offline Preference Data for Improved RLHF. Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Zheng Wen. OffRL. 42 2 0. 13 Jun 2024.
On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits. Antoine Barrier, Aurélien Garivier, Gilles Stoltz. 44 13 0. 30 Sep 2022.
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits. Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Chao Yu. AI4TS. 33 6 0. 30 Sep 2022.
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems. Sergey Levine, Aviral Kumar, George Tucker, Justin Fu. OffRL, GP. 343 1,968 0. 04 May 2020.