Approximate Function Evaluation via Multi-Armed Bandits

We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\mu$, where each component of $\mu$ can be sampled via a noisy oracle. It is more sample-efficient to sample more frequently the components of $\mu$ corresponding to directions in which $f$ has larger directional derivatives. However, as $\mu$ is unknown, the optimal sampling frequencies are also unknown. We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\mu)$. We generalize our algorithm to adapt to heteroskedastic noise, and prove asymptotic optimality when $f$ is linear. We corroborate our theoretical results with numerical experiments, showing the dramatic gains afforded by adaptivity.
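To make the adaptive-sampling idea concrete, below is a minimal Python sketch of a two-phase scheme in that spirit. It is not the paper's algorithm: the function and parameter names (`adaptive_estimate`, `sample_coord`, `explore_frac`) are placeholders introduced only for illustration, and the allocation rule simply spends the remaining budget in proportion to the estimated partial derivatives, which reduces the variance of the first-order error $\sum_i g_i(\hat{\mu}_i - \mu_i)$ when the per-coordinate noise level is fixed.

```python
import numpy as np

def adaptive_estimate(f, grad_f, sample_coord, n_coords, budget, explore_frac=0.2):
    """Illustrative two-phase estimator of f(mu), where each coordinate mu_i
    is only observable through the noisy oracle sample_coord(i) = mu_i + noise.

    f            : callable, the known smooth function R^n -> R
    grad_f       : callable, its gradient (known, like f)
    sample_coord : callable i -> one noisy sample of mu_i
    """
    # Phase 1: uniform exploration to get a rough estimate of mu.
    n_explore = max(1, int(explore_frac * budget) // n_coords)
    sums = np.zeros(n_coords)
    counts = np.full(n_coords, n_explore)
    for i in range(n_coords):
        sums[i] = sum(sample_coord(i) for _ in range(n_explore))
    mu_hat = sums / counts

    # Phase 2: allocate the remaining budget in proportion to the estimated
    # importance |df/dx_i| of each coordinate at the rough estimate.
    g = np.abs(grad_f(mu_hat)) + 1e-12            # avoid zero allocation
    remaining = max(0, budget - n_explore * n_coords)
    alloc = np.floor(remaining * g / g.sum()).astype(int)
    for i in range(n_coords):
        for _ in range(alloc[i]):
            sums[i] += sample_coord(i)
        counts[i] += alloc[i]

    mu_hat = sums / counts
    return f(mu_hat)


# Toy usage: f puts far more weight on coordinate 0, so most samples go there.
rng = np.random.default_rng(0)
mu = np.array([1.0, 2.0])
f = lambda x: 3.0 * x[0] + 0.1 * x[1]
grad_f = lambda x: np.array([3.0, 0.1])
oracle = lambda i: mu[i] + rng.normal(0.0, 1.0)
print(adaptive_estimate(f, grad_f, oracle, n_coords=2, budget=2000))
```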