
Position: Federated Foundation Language Model Post-Training Should Focus on Open-Source Models

Main: 7 pages; Bibliography: 6 pages; 1 table
Abstract

Post-training of foundation language models has emerged as a promising research domain in federated learning (FL), with the goal of enabling privacy-preserving model improvements and adaptation to users' downstream tasks. Recent advances in this area adopt centralized post-training approaches that build upon black-box foundation language models, where there is no access to model weights or architecture details. Although the use of black-box models has been successful in centralized post-training, blindly replicating it in FL raises several concerns. Our position is that using black-box models in FL contradicts core principles of federation such as data privacy and autonomy. In this position paper, we critically analyze the use of black-box models in federated post-training and provide a detailed account of various aspects of openness and their implications for FL.

@article{agrawal2025_2505.23593,
  title={Position: Federated Foundation Language Model Post-Training Should Focus on Open-Source Models},
  author={Nikita Agrawal and Simon Mertel and Ruben Mayer},
  journal={arXiv preprint arXiv:2505.23593},
  year={2025}
}