Optimal Exploitation of Clustering and History Information in Multi-Armed Bandit

31 May 2019

Papers citing "Optimal Exploitation of Clustering and History Information in Multi-Armed Bandit"

Title
Online Bandit Learning with Offline Preference Data for Improved RLHF; Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Zheng Wen; OffRL; 42 2 0; 13 Jun 2024
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits; Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Chao Yu; AI4TS; 31 6 0; 30 Sep 2022
How can AI Automate End-to-End Data Science?; Charu C. Aggarwal, Djallel Bouneffouf, Horst Samulowitz, Beat Buesser, T. Hoang, ..., Tejaswini Pedapati, Parikshit Ram, Ambrish Rawat, Martin Wistuba, Alexander G. Gray; 88 14 0; 22 Oct 2019