A Communication and Computation Efficient Fully First-order Method for Decentralized Bilevel Optimization

18 October 2024
Min Wen
Chengchang Liu
Ahmed Abdelmoniem
Yipeng Zhou
Yuedong Xu
arXiv:2410.14115
Abstract

Bilevel optimization, crucial for hyperparameter tuning, meta-learning and reinforcement learning, remains less explored in decentralized learning paradigms such as decentralized federated learning (DFL). Typically, decentralized bilevel methods rely on both gradients and Hessian matrices to approximate hypergradients of the upper-level models. However, acquiring and sharing the second-order oracle is computation and communication intensive. To overcome these challenges, this paper introduces a fully first-order decentralized method for decentralized bilevel optimization, $\text{C}^2$DFB, which is both computation- and communication-efficient. In $\text{C}^2$DFB, each learning node optimizes a min-min-max problem to approximate the hypergradient using gradient information exclusively. To reduce the traffic load in the inner loop of solving the lower-level problem, $\text{C}^2$DFB incorporates a lightweight communication protocol that efficiently transmits compressed residuals of local parameters. Rigorous theoretical analysis ensures its convergence, indicating a first-order oracle complexity of $\tilde{\mathcal{O}}(\epsilon^{-4})$. Experiments on hyperparameter tuning and hyper-representation tasks validate the superiority of $\text{C}^2$DFB across various topologies and heterogeneous data distributions.
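
The key idea behind fully first-order bilevel methods is to estimate the hypergradient from gradients alone, avoiding Hessian-vector products. Below is a minimal sketch of that idea on a toy quadratic bilevel problem, together with a toy top-k residual compressor standing in for the compressed-residual communication. The specific objectives, the penalty weight `lam`, the step sizes, and `top_k_compress` are illustrative assumptions for a penalty-style reformulation, not the paper's actual C²DFB algorithm or code.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 5, 4                      # upper- and lower-level dimensions
A = rng.normal(size=(p, d))
b = rng.normal(size=p)

# Toy problem (assumed for illustration):
#   upper level f(x, y) = 0.5 * ||y - b||^2    (depends on x only through y)
#   lower level g(x, y) = 0.5 * ||y - A x||^2  (minimized by y*(x) = A x)
# so the exact hypergradient is A^T (A x - b).
grad_f_y = lambda x, y: y - b
grad_g_x = lambda x, y: A.T @ (A @ x - y)
grad_g_y = lambda x, y: y - A @ x

def inner_minimize(grad_y, x, steps=200, eta=0.2):
    """Approximately minimize an objective over y using only its y-gradient."""
    y = np.zeros(p)
    for _ in range(steps):
        y = y - eta * grad_y(x, y)
    return y

def first_order_hypergradient(x, lam=100.0):
    """Hessian-free hypergradient estimate (penalty-style, gradients only).

    y_star ~ argmin_y g(x, y)
    y_lam  ~ argmin_y f(x, y) + lam * g(x, y)
    estimate = grad_x f + lam * (grad_x g(x, y_lam) - grad_x g(x, y_star))
    """
    y_star = inner_minimize(grad_g_y, x)
    y_lam = inner_minimize(lambda x_, y: grad_f_y(x_, y) + lam * grad_g_y(x_, y),
                           x, eta=0.2 / (1.0 + lam))
    # grad_x f is zero for this toy f, so it is omitted from the estimate.
    return lam * (grad_g_x(x, y_lam) - grad_g_x(x, y_star))

def top_k_compress(residual, k=2):
    """Toy residual compressor: keep only the k largest-magnitude entries.

    In a decentralized setting, nodes would exchange compressed residuals of
    their local parameters during the inner loop instead of full vectors.
    """
    out = np.zeros_like(residual)
    idx = np.argsort(np.abs(residual))[-k:]
    out[idx] = residual[idx]
    return out

x = rng.normal(size=d)
true_grad = A.T @ (A @ x - b)    # closed-form hypergradient of 0.5 * ||A x - b||^2
est = first_order_hypergradient(x)
print("relative error:", np.linalg.norm(est - true_grad) / np.linalg.norm(true_grad))
print("compressed residual:", top_k_compress(est - true_grad))
```

On this toy problem the first-order estimate approaches the exact hypergradient as the penalty weight grows (the relative error here is roughly 1/(1 + lam)), while the compressed residuals illustrate how per-round traffic in the inner loop could be reduced.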
