Automatic Calibration for Membership Inference Attack on Large Language Models

6 May 2025
Saleh Zare Zade, Yao Qiang, Xiangyu Zhou, Hui Zhu, Mohammad Amin Roshani, Prashant Khanduri, Dongxiao Zhu
Abstract

Membership Inference Attacks (MIAs) have recently been employed to determine whether a specific text was part of the pre-training data of Large Language Models (LLMs). However, existing methods often misclassify non-members as members, leading to a high false positive rate, or depend on additional reference models for probability calibration, which limits their practicality. To overcome these challenges, we introduce a novel framework called Automatic Calibration Membership Inference Attack (ACMIA), which uses a tunable temperature to calibrate output probabilities effectively. This approach is inspired by our theoretical insights into maximum likelihood estimation during LLM pre-training. We present ACMIA in three configurations designed to accommodate different levels of model access and to widen the probability gap between members and non-members, improving the reliability and robustness of membership inference. Extensive experiments on various open-source LLMs demonstrate that our proposed attack is highly effective, robust, and generalizable, surpassing state-of-the-art baselines across three widely used benchmarks. Our code is available on GitHub: this https URL.
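The abstract describes calibrating a model's output probabilities with a tunable temperature before scoring membership. The sketch below illustrates that general idea: rescale a causal LM's logits by a temperature, compute the sequence's average log-likelihood, and threshold the resulting score. This is a minimal illustration under stated assumptions, not the authors' ACMIA formulation; the model name, temperature, and threshold are placeholders, and the paper's three access-level configurations are not reproduced here.

```python
# Minimal sketch of temperature-calibrated log-likelihood scoring for
# membership inference. Illustrative only; NOT the ACMIA method itself.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; the paper evaluates various open-source LLMs

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def calibrated_score(text: str, temperature: float = 1.0) -> float:
    """Average log-probability of `text` under temperature-scaled logits.

    Higher scores suggest the model finds the text more familiar and hence
    more likely to have appeared in its pre-training data.
    """
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)
    # The tunable temperature rescales the logits before the softmax,
    # sharpening (T < 1) or flattening (T > 1) the output distribution.
    log_probs = torch.log_softmax(logits / temperature, dim=-1)
    # Log-probability assigned to each actual next token.
    token_lp = log_probs[0, :-1].gather(1, ids[0, 1:, None]).squeeze(-1)
    return token_lp.mean().item()


def infer_membership(text: str, temperature: float, threshold: float) -> bool:
    # `threshold` is a placeholder; it would be chosen on held-out data.
    return calibrated_score(text, temperature) > threshold
```

In practice the temperature would be tuned so that the score distributions of known members and non-members separate as sharply as possible, which is the calibration role the abstract attributes to it.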

@article{zade2025_2505.03392,
  title={Automatic Calibration for Membership Inference Attack on Large Language Models},
  author={Saleh Zare Zade and Yao Qiang and Xiangyu Zhou and Hui Zhu and Mohammad Amin Roshani and Prashant Khanduri and Dongxiao Zhu},
  journal={arXiv preprint arXiv:2505.03392},
  year={2025}
}