Private Transformer Inference in MLaaS: A Survey

Abstract

Transformer models have revolutionized AI, powering applications such as content generation and sentiment analysis. However, their deployment in Machine Learning as a Service (MLaaS) raises significant privacy concerns, primarily because sensitive user data is processed centrally. Private Transformer Inference (PTI) addresses this by using cryptographic techniques such as secure multi-party computation and homomorphic encryption, enabling inference while preserving the privacy of both user data and the model. This paper reviews recent advances in PTI, highlighting state-of-the-art solutions and open challenges. We also introduce a structured taxonomy and evaluation framework for PTI, focusing on balancing resource efficiency with privacy and on bridging the gap between high-performance inference and data privacy.
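
For intuition, the snippet below is a minimal, hypothetical Python sketch of additive secret sharing, one of the secure multi-party computation building blocks that PTI protocols rely on. The two-server setting, the helper names (share, reconstruct), and the simplification that the weight matrix W is public are illustrative assumptions, not details taken from this survey.

import numpy as np

P = 2**31 - 1  # prime modulus for the arithmetic shares (toy-sized)
rng = np.random.default_rng(0)

def share(x):
    # Split an integer vector x into two additive shares modulo P.
    r = rng.integers(0, P, size=x.shape, dtype=np.int64)
    return r, (x - r) % P

def reconstruct(s0, s1):
    # Recombine the two shares to recover the value modulo P.
    return (s0 + s1) % P

# Toy "linear layer": W is treated as public here for brevity; real PTI
# protocols also hide W (e.g., via Beaver triples or homomorphic encryption).
W = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)
x = np.array([7, 8, 9], dtype=np.int64)

x0, x1 = share(x)          # client sends one share to each server
y0 = (W @ x0) % P          # server 0 computes on its share only
y1 = (W @ x1) % P          # server 1 computes on its share only
y = reconstruct(y0, y1)    # client recombines; neither server saw x

assert np.array_equal(y, (W @ x) % P)
print(y)  # [ 50 122]

Linear layers shard naturally in this way; the nonlinear Transformer components (softmax, GELU, LayerNorm) require dedicated protocols, which is where much of the work surveyed here concentrates.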

@article{li2025_2505.10315,
  title={Private Transformer Inference in MLaaS: A Survey},
  author={Yang Li and Xinyu Zhou and Yitong Wang and Liangxin Qian and Jun Zhao},
  journal={arXiv preprint arXiv:2505.10315},
  year={2025}
}