arXiv:2106.12200, v2 (latest)

Random Effect Bandits

23 June 2021
Rong Zhu
Branislav Kveton
Abstract

This paper studies regret minimization in multi-armed bandits, a classical online learning problem. To develop more statistically efficient algorithms, we propose to use the assumption of a random-effect model. In this model, the mean rewards of arms are drawn independently from an unknown distribution, whose parameters we estimate. We provide an estimator of the arm means in this model and also analyze its uncertainty. Based on these results, we design a UCB algorithm, which we call ReUCB. We analyze ReUCB and prove a Bayes regret bound on its $n$-round regret, which matches an existing lower bound. Our experiments show that ReUCB can outperform Thompson sampling in various scenarios, without assuming that the prior distribution of arm means is known.
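
The idea in the abstract can be illustrated with a minimal sketch. This is not the paper's ReUCB: the estimator, the confidence width, and every name below (`shrinkage_ucb`, `noise_sd`, the method-of-moments variance estimate, the `2 log t` bonus) are assumptions chosen for illustration. The sketch assumes Gaussian rewards with known noise, estimates the mean and variance of the unknown distribution of arm means from the data, shrinks each arm's sample mean toward the estimated common mean, and acts on an upper confidence bound built from the shrunken estimate.

```python
import numpy as np

def shrinkage_ucb(true_means, n_rounds, noise_sd=1.0, seed=0):
    """Illustrative UCB with shrinkage toward an estimated common mean.

    Hypothetical sketch, not the paper's ReUCB. Requires at least two arms.
    """
    rng = np.random.default_rng(seed)
    k = len(true_means)
    counts = np.zeros(k)   # number of pulls per arm
    sums = np.zeros(k)     # cumulative reward per arm
    total_reward = 0.0
    for t in range(n_rounds):
        if t < k:
            arm = t  # initialization: pull each arm once
        else:
            means = sums / counts
            noise_var = noise_sd**2 / counts  # variance of each sample mean
            # Estimate the parameters of the unknown distribution of arm means.
            mu0 = means.mean()
            # Method-of-moments variance estimate, floored to stay positive.
            sigma0_sq = max(means.var(ddof=1) - noise_var.mean(), 1e-6)
            # Shrinkage weight: near 1 for well-sampled arms, smaller otherwise.
            w = sigma0_sq / (sigma0_sq + noise_var)
            est = w * means + (1.0 - w) * mu0
            # Gaussian posterior variance if (mu0, sigma0_sq) were exact.
            post_var = w * noise_var
            bonus = np.sqrt(2.0 * np.log(t + 1) * post_var)
            arm = int(np.argmax(est + bonus))
        reward = rng.normal(true_means[arm], noise_sd)
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward
    return counts, total_reward

# Usage: three arms, one clearly better than the others.
counts, total = shrinkage_ucb(np.array([0.0, 0.0, 5.0]), n_rounds=200)
```

Because `post_var` is never larger than the unshrunk variance `noise_var`, the confidence intervals in this sketch are narrower than those of a UCB rule that treats arms in isolation, which is one way to see where a gain in statistical efficiency could come from. The sketch ignores the uncertainty in the estimated distribution parameters, which a careful analysis would need to account for.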
