Projection-based Lyapunov method for fully heterogeneous weakly-coupled MDPs

Abstract

Heterogeneity poses a fundamental challenge for many real-world large-scale decision-making problems but remains largely understudied. In this paper, we study the fully heterogeneous setting of a prominent class of such problems, known as weakly-coupled Markov decision processes (WCMDPs). Each WCMDP consists of $N$ arms (or subproblems), which have distinct model parameters in the fully heterogeneous setting, leading to the curse of dimensionality when $N$ is large. We show that, under mild assumptions, an efficiently computable policy achieves an $O(1/\sqrt{N})$ optimality gap in the long-run average reward per arm for fully heterogeneous WCMDPs as $N$ becomes large. This is the first asymptotic optimality result for fully heterogeneous average-reward WCMDPs. Our main technical innovation is the construction of projection-based Lyapunov functions that certify the convergence of rewards and costs to an optimal region, even under full heterogeneity.

@article{zhang2025_2502.06072,
  title={Projection-based Lyapunov method for fully heterogeneous weakly-coupled MDPs},
  author={Xiangcheng Zhang and Yige Hong and Weina Wang},
  journal={arXiv preprint arXiv:2502.06072},
  year={2025}
}