The Batch Complexity of Bandit Pure Exploration

3 February 2025

Abstract

In a fixed-confidence pure exploration problem in stochastic multi-armed bandits, an algorithm iteratively samples arms and should stop as early as possible and return the correct answer to a query about the arms distributions. We are interested in batched methods, which change their sampling behaviour only a few times, between batches of observations. We give an instance-dependent lower bound on the number of batches used by any sample efficient algorithm for any pure exploration task. We then give a general batched algorithm and prove upper bounds on its expected sample complexity and batch complexity. We illustrate both lower and upper bounds on best-arm identification and thresholding bandits.

View on arXiv

@article{tuynman2025_2502.01425,
  title={ The Batch Complexity of Bandit Pure Exploration },
  author={ Adrienne Tuynman and Rémy Degenne },
  journal={arXiv preprint arXiv:2502.01425},
  year={ 2025 }
}

Comments on this paper