Optimal Thresholding Linear Bandit

Abstract

We study a novel pure exploration problem: the ϵ-Thresholding Bandit Problem (TBP) with fixed confidence in stochastic linear bandits. We prove a lower bound on the sample complexity and extend an algorithm originally designed for Best Arm Identification in the linear case to TBP; the resulting algorithm is asymptotically optimal.
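
The abstract does not spell out the problem, so the following is only a minimal sketch of what an ϵ-correct answer could look like. It assumes the usual thresholding-bandit formulation: each arm is a feature vector x, its mean reward is the inner product of x with an unknown parameter theta, and the learner must separate arms above a threshold tau from arms below it, with arms within ϵ of tau allowed on either side. The function name `is_eps_correct` and all parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np

def is_eps_correct(arms, theta, tau, eps, predicted_above):
    """Check whether a predicted classification is eps-correct.

    Assumed formulation (not taken from the paper): arm i has mean
    arms[i] @ theta. Arms with mean > tau + eps must be predicted above
    the threshold, arms with mean < tau - eps must be predicted below it,
    and arms inside the band [tau - eps, tau + eps] may go either way.
    """
    means = np.asarray(arms, dtype=float) @ np.asarray(theta, dtype=float)
    predicted_above = np.asarray(predicted_above, dtype=bool)

    must_be_above = means > tau + eps
    must_be_below = means < tau - eps

    no_missed_arm = np.all(predicted_above[must_be_above])
    no_false_arm = not np.any(predicted_above[must_be_below])
    return bool(no_missed_arm and no_false_arm)

# Illustrative instance: three arms in R^2 with means 0.2, 0.5, 0.7.
arms = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta = np.array([0.2, 0.5])
tau, eps = 0.45, 0.1

# The middle arm (mean 0.5) lies inside the tolerance band, so either
# label for it is acceptable.
print(is_eps_correct(arms, theta, tau, eps, [False, True, True]))
print(is_eps_correct(arms, theta, tau, eps, [False, False, True]))
# Labelling the first arm (mean 0.2) as above the threshold is an error.
print(is_eps_correct(arms, theta, tau, eps, [True, False, True]))
```

In the fixed-confidence setting, the learner does not know theta and has to keep sampling until it can return such a classification with probability at least 1 - δ for a given δ; the sample complexity is the number of samples this takes.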
