Almost Optimal Algorithms for Two-player Zero-Sum Linear Mixture Markov Games

15 February 2021
Zixiang Chen
Dongruo Zhou
Quanquan Gu
arXiv:2102.07404
Abstract

We study reinforcement learning for two-player zero-sum Markov games with simultaneous moves in the finite-horizon setting, where the transition kernel of the underlying Markov game can be parameterized by a linear function over the current state, both players' actions, and the next state. In particular, we assume that we can control both players and aim to find the Nash equilibrium by minimizing the duality gap. We propose an algorithm, Nash-UCRL, based on the principle of "Optimism-in-the-Face-of-Uncertainty". Our algorithm only needs to find a Coarse Correlated Equilibrium (CCE), which can be computed efficiently. Specifically, we show that Nash-UCRL provably achieves an $\tilde{O}(dH\sqrt{T})$ regret, where $d$ is the linear function dimension, $H$ is the length of the game, and $T$ is the total number of steps in the game. To assess the optimality of our algorithm, we also prove an $\tilde{\Omega}(dH\sqrt{T})$ lower bound on the regret. Our upper bound matches the lower bound up to logarithmic factors, which suggests the optimality of our algorithm.
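To make the setting concrete, the following is a minimal sketch of the linear mixture parameterization and the regret notion described in the abstract, written in the standard notation of the linear mixture MDP literature; the symbols $\phi$, $\theta_h$, $\mu^k$, $\nu^k$, and $s_1$ are our notational assumptions for illustration, not necessarily the paper's own.

% Sketch of the linear mixture game setup and duality-gap regret.
% Notation (phi, theta_h, mu^k, nu^k) is assumed, following the
% standard linear mixture MDP literature, not taken from the paper.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

The transition kernel at stage $h$ is linear in a known feature map $\phi$:
\[
  \mathbb{P}_h(s' \mid s, a, b)
  \;=\;
  \bigl\langle \phi(s' \mid s, a, b),\, \theta_h \bigr\rangle,
  \qquad \theta_h \in \mathbb{R}^d,
\]
where $a$ and $b$ are the two players' simultaneous actions.

Over $K$ episodes of horizon $H$ (so $T = KH$ total steps), the regret is
the cumulative duality gap of the controlled policy pairs $(\mu^k, \nu^k)$:
\[
  \mathrm{Regret}(K)
  \;=\;
  \sum_{k=1}^{K}
  \Bigl( V_1^{\dagger,\, \nu^k}(s_1) \;-\; V_1^{\mu^k,\, \dagger}(s_1) \Bigr),
\]
where $\dagger$ denotes a best response to the other player's policy. The
abstract's bounds read $\mathrm{Regret}(K) = \tilde{O}(dH\sqrt{T})$, matching
the $\tilde{\Omega}(dH\sqrt{T})$ lower bound up to logarithmic factors.

\end{document}

Under this reading, the quantity being minimized is exactly the gap between what each player could gain by deviating, which is zero at a Nash equilibrium.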
