Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection


20 January 2025
Ali Naseh
Niloofar Mireshghallah

Papers citing "Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection"

16 / 16 papers shown
Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data
Jie Zhang
Debeshee Das
Gautam Kamath
Florian Tramèr
MIALM, MIACV
314
27
1
29 Sep 2024
Evaluating Copyright Takedown Methods for Language Models
Boyi Wei
Weijia Shi
Yangsibo Huang
Noah A. Smith
Chiyuan Zhang
Luke Zettlemoyer
Kai Li
Peter Henderson
145
25
0
26 Jun 2024
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Debeshee Das
Jie Zhang
Florian Tramèr
MIALM
180
39
1
23 Jun 2024
LLM Dataset Inference: Did you train on my dataset?
Pratyush Maini
Hengrui Jia
Nicolas Papernot
Adam Dziedzic
MIALM
146
47
0
10 Jun 2024
Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models
Jingyang Zhang
Jingwei Sun
Eric C. Yeats
Ouyang Yang
Martin Kuo
Jianyi Zhang
Hao Frank Yang
Hai "Helen" Li
154
54
0
03 Apr 2024
PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining
Mishaal Kazmi
H. Lautraite
Alireza Akbari
Mauricio Soroco
Qiaoyue Tang
Tao Wang
Sébastien Gambs
Mathias Lécuyer
95
11
0
12 Feb 2024
Do Membership Inference Attacks Work on Large Language Models?
Michael Duan
Anshuman Suri
Niloofar Mireshghallah
Sewon Min
Weijia Shi
Luke Zettlemoyer
Yulia Tsvetkov
Yejin Choi
David Evans
Hanna Hajishirzi
MIALM
132
101
0
12 Feb 2024
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Lianghui Zhu
Xinggang Wang
Xinlong Wang
ELM, ALM
184
143
0
26 Oct 2023
Detecting Pretraining Data from Large Language Models
Weijia Shi
Anirudh Ajith
Mengzhou Xia
Yangsibo Huang
Daogao Liu
Terra Blevins
Danqi Chen
Luke Zettlemoyer
MIALM
122
201
0
25 Oct 2023
Synthetic is all you need: removing the auxiliary data assumption for membership inference attacks against synthetic data
Florent Guépin
Matthieu Meeus
Ana-Maria Cretu
Yves-Alexandre de Montjoye
63
10
0
04 Jul 2023
On Evaluating Multilingual Compositional Generalization with Translated Datasets
Zi Wang
Daniel Hershcovich
102
7
0
20 Jun 2023
Smaller Language Models are Better Black-box Machine-Generated Text Detectors
Niloofar Mireshghallah
Justus Mattern
Sicun Gao
Reza Shokri
Taylor Berg-Kirkpatrick
DeLMO
123
48
0
17 May 2023
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
E. Mitchell
Yoonho Lee
Alexander Khazatsky
Christopher D. Manning
Chelsea Finn
132
633
0
26 Jan 2023
Membership Inference Attacks From First Principles
Nicholas Carlini
Steve Chien
Milad Nasr
Shuang Song
Andreas Terzis
Florian Tramèr
MIACV, MIALM
175
713
0
07 Dec 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
486
2,128
0
31 Dec 2020
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
Basel Alomair
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU, SILM
574
1,969
0
14 Dec 2020