Traceable Black-box Watermarks for Federated Learning

Abstract

Due to the distributed nature of Federated Learning (FL) systems, each local client has access to the global model, posing a critical risk of model leakage. Existing works have explored injecting watermarks into local models to enable intellectual property protection. However, these methods focus either on non-traceable watermarks or on traceable but white-box watermarks. We identify a gap in the literature regarding the formal definition of traceable black-box watermarking and the formulation of the problem of injecting such watermarks into FL systems. In this work, we first formalize the problem of injecting traceable black-box watermarks into FL. Building on this formulation, we propose a novel server-side watermarking method, TraMark, which creates a traceable watermarked model for each client, enabling verification of model leakage in black-box settings. To achieve this, TraMark partitions the model parameter space into two distinct regions: the main task region and the watermarking region. Subsequently, a personalized global model is constructed for each client by aggregating only the main task region while preserving the watermarking region. Each model then learns a unique watermark exclusively within the watermarking region using a distinct watermark dataset before being sent back to the local client. Extensive results across various FL systems demonstrate that TraMark ensures the traceability of all watermarked models while preserving their main task performance.
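
The partition-and-aggregate step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the flat parameter vectors, the boolean mask `wm_mask`, the function names `personalize` and `masked_watermark_step`, and the use of plain averaging with a single gradient step are all assumptions made for illustration. The sketch averages the main task region across clients, keeps each client's own values in the watermarking region, and restricts a watermark update to that region.

```python
import numpy as np

def personalize(client_params, wm_mask):
    """Build one personalized model per client (hypothetical helper).

    client_params: array-like of shape (n_clients, n_params)
    wm_mask: boolean array-like of shape (n_params,), True marks the
             watermarking region, False marks the main task region.
    """
    client_params = np.asarray(client_params, dtype=float)
    wm_mask = np.asarray(wm_mask, dtype=bool)
    # Assumption: simple unweighted averaging of the client models.
    averaged = client_params.mean(axis=0)
    # Main task region takes the average; watermarking region keeps
    # each client's own values (the mask broadcasts over clients).
    return np.where(wm_mask, client_params, averaged)

def masked_watermark_step(params, grad, wm_mask, lr=0.1):
    """Apply one gradient step only inside the watermarking region."""
    wm_mask = np.asarray(wm_mask, dtype=bool)
    masked_grad = np.where(wm_mask, np.asarray(grad, dtype=float), 0.0)
    return np.asarray(params, dtype=float) - lr * masked_grad

# Toy example: two clients, three parameters, last one reserved for the watermark.
clients = [[1.0, 2.0, 10.0], [3.0, 4.0, 20.0]]
mask = [False, False, True]
personalized = personalize(clients, mask)
updated = masked_watermark_step(personalized[0], [1.0, 1.0, 1.0], mask, lr=0.5)
```

In this toy setup the two personalized models share the main task parameters and differ only in the masked coordinate, which is what would allow a per-client watermark to stay distinct across aggregation rounds.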

@article{xu2025_2505.13651,
  title={Traceable Black-box Watermarks for Federated Learning},
  author={Jiahao Xu and Rui Hu and Olivera Kotevska and Zikai Zhang},
  journal={arXiv preprint arXiv:2505.13651},
  year={2025}
}