Partition Tree Weighting for Non-Stationary Stochastic Bandits

26 February 2025
Joel Veness
Marcus Hutter
Andras Gyorgy
Jordi Grau-Moya
Abstract

This paper considers a generalisation of universal source coding for interaction data, namely data streams that have actions interleaved with observations. Our goal is to construct a coding distribution that is both universal and usable as a control policy. Allowing for action generation needs careful treatment, as naive approaches that do not distinguish between actions and observations run into the self-delusion problem in universal settings. We showcase our perspective in the context of the challenging non-stationary stochastic Bernoulli bandit problem. Our main contribution is an efficient and high-performing algorithm for this problem that generalises the Partition Tree Weighting universal source coding technique for passive prediction to the control setting.
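
To make the problem setting concrete, the following is a minimal Python sketch of a non-stationary Bernoulli bandit environment. It is an illustration only and is not taken from the paper: the class name `PiecewiseBernoulliBandit`, its `segments` argument, and the assumption that the arm means change abruptly at fixed change points are all hypothetical choices made for this example. It shows only the environment an agent interacts with, not the paper's algorithm.

```python
import random


class PiecewiseBernoulliBandit:
    """Hypothetical toy environment (not from the paper).

    Each arm pays a Bernoulli reward. The arm means are constant within a
    segment and may jump when the next segment starts (an assumed,
    piecewise-stationary form of non-stationarity).
    """

    def __init__(self, segments, seed=0):
        # segments: list of (length, [mean of each arm]) pairs, played in order.
        self.segments = segments
        self.rng = random.Random(seed)
        self.t = 0  # number of pulls made so far

    def means(self):
        # Locate the segment that contains the current time step.
        start = 0
        for length, arm_means in self.segments:
            if self.t < start + length:
                return arm_means
            start += length
        # Past the last change point: keep using the final segment's means.
        return self.segments[-1][1]

    def pull(self, arm):
        # Sample a 0/1 reward for the chosen arm, then advance time.
        p = self.means()[arm]
        self.t += 1
        return 1 if self.rng.random() < p else 0


# Two arms whose roles swap after 100 steps.
env = PiecewiseBernoulliBandit([(100, [0.9, 0.1]), (100, [0.1, 0.9])])
rewards = [env.pull(0) for _ in range(200)]
```

In this sketch, an agent that keeps pulling arm 0 does well for the first 100 steps and poorly afterwards, which is the kind of change a policy for the non-stationary setting has to detect and adapt to.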

@article{veness2025_2502.19325,
  title={Partition Tree Weighting for Non-Stationary Stochastic Bandits},
  author={Joel Veness and Marcus Hutter and Andras Gyorgy and Jordi Grau-Moya},
  journal={arXiv preprint arXiv:2502.19325},
  year={2025}
}