
Sample Complexity of Identifying the Nonredundancy of Nontransitive Games in Dueling Bandits

8 May 2025
Shang Lu
Shuji Kijima
Abstract

The dueling bandit is a variant of the multi-armed bandit in which a binary relation is learned through pairwise comparisons. Most work on the dueling bandit has targeted transitive relations, that is, totally or partially ordered sets, or has assumed at least the existence of a champion such as a Condorcet winner or a Copeland winner. This work develops an analysis of dueling bandits for non-transitive relations. Jan-ken (a.k.a. rock-paper-scissors) is a typical example of a non-transitive relation. It is known that a rational player chooses one of the three items uniformly at random, which is a Nash equilibrium in game theory. Interestingly, any variant of Jan-ken with four items (e.g., rock, paper, scissors, and well) contains at least one useless item, which is never selected by a rational player. This work investigates a dueling bandit problem of identifying whether all $n$ items are indispensable in a given win-lose relation. We then provide upper and lower bounds on the sample complexity of the identification problem in terms of the determinant of $A$ and a solution of $\mathbf{x}^{\top} A = \mathbf{0}^{\top}$, where $A$ is an $n \times n$ pay-off matrix that every duel follows.
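As a rough illustration of the quantities named in the abstract, the sketch below checks the Jan-ken examples numerically. It is not the paper's algorithm and says nothing about sample complexity. It assumes a deterministic skew-symmetric encoding (entry +1 if the row item beats the column item, -1 if it loses, 0 on a tie), which may differ from the paper's pay-off convention, and it assumes the usual rule that well beats rock and scissors and loses to paper. The helper names `det` and `left_product` are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def left_product(x, matrix):
    """Return the row vector x^T A as a list."""
    n = len(matrix)
    return [sum(x[i] * matrix[i][j] for i in range(n)) for j in range(n)]


# Assumed encoding: +1 win, -1 loss, 0 tie. Order: rock, paper, scissors(, well).
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
RPSW = [[0, -1, 1, -1],
        [1, 0, -1, 1],
        [-1, 1, 0, -1],
        [1, -1, 1, 0]]

third = Fraction(1, 3)

# Three items: the uniform strategy solves x^T A = 0^T, and det(A) = 0.
rps_check = left_product([third] * 3, RPS)
rps_det = det(RPS)

# Four items: det(A) is nonzero, so x^T A = 0^T has only the trivial solution.
rpsw_det = det(RPSW)

# Mixing uniformly over paper, scissors, and well never loses in expectation,
# and strictly wins against rock.
rpsw_check = left_product([0, third, third, third], RPSW)
```

Under these assumptions, `rps_check` is the zero vector while `rpsw_check` is positive only in the rock coordinate, which is consistent with the abstract's remark that a four-item variant contains a useless item (here, rock).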

@article{lu2025_2505.05014,
  title={Sample Complexity of Identifying the Nonredundancy of Nontransitive Games in Dueling Bandits},
  author={Shang Lu and Shuji Kijima},
  journal={arXiv preprint arXiv:2505.05014},
  year={2025}
}