Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs

5 March 2024
Aly M. Kassem
Omar Mahmoud
Niloofar Mireshghallah
Hyunwoo J. Kim
Yulia Tsvetkov
Yejin Choi
Sherif Saad
Santu Rana

Papers citing "Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs"

46 / 46 papers shown
AdaToken-3D: Dynamic Spatial Gating for Efficient 3D Large Multimodal-Models Reasoning
Kai Zhang
Xingyu Chen
Xiaofeng Zhang
51
0
0
19 May 2025
LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures
Francisco Aguilera-Martínez
Fernando Berzal
PILM
88
0
0
02 May 2025
Merger-as-a-Stealer: Stealing Targeted PII from Aligned LLMs with Model Merging
Lin Lu
Zhigang Zuo
Ziji Sheng
Pan Zhou
MoMe
96
0
0
22 Feb 2025
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
131
7
0
03 Oct 2024
Manipulation Facing Threats: Evaluating Physical Vulnerabilities in End-to-End Vision Language Action Models
Hao Cheng
Erjia Xiao
Chengyuan Yu
Zhao Yao
Jiahang Cao
...
Jiaxu Wang
Mengshu Sun
Kaidi Xu
Jindong Gu
Renjing Xu
AAML
58
3
0
20 Sep 2024
Demystifying Verbatim Memorization in Large Language Models
Jing Huang
Diyi Yang
Christopher Potts
ELM
PILM
MU
86
25
0
25 Jul 2024
CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation
Tong Chen
Akari Asai
Niloofar Mireshghallah
Sewon Min
James Grimmelmann
Yejin Choi
Hannaneh Hajishirzi
Luke Zettlemoyer
Pang Wei Koh
87
18
0
09 Jul 2024
Towards More Realistic Extraction Attacks: An Adversarial Perspective
Yash More
Prakhar Ganesh
G. Farnadi
AAML
94
6
0
02 Jul 2024
Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Frontier AI Models
Sunny Duan
Mikail Khona
Abhiram Iyer
Rylan Schaeffer
Ila R Fiete
86
5
0
20 Jun 2024
Measuring memorization in RLHF for code completion
Aneesh Pappu
Billy Porter
Ilia Shumailov
Jamie Hayes
63
1
0
17 Jun 2024
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Yuntian Deng
Radha Poovendran
Yejin Choi
Bill Yuchen Lin
SyDa
84
146
0
12 Jun 2024
Coercing LLMs to do and reveal (almost) anything
Jonas Geiping
Alex Stein
Manli Shu
Khalid Saifullah
Yuxin Wen
Tom Goldstein
AAML
64
47
0
21 Feb 2024
Do Membership Inference Attacks Work on Large Language Models?
Michael Duan
Anshuman Suri
Niloofar Mireshghallah
Sewon Min
Weijia Shi
Luke Zettlemoyer
Yulia Tsvetkov
Yejin Choi
David Evans
Hanna Hajishirzi
MIALM
75
85
0
12 Feb 2024
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
160
377
0
01 Feb 2024
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Luca Soldaini
Rodney Michael Kinney
Akshita Bhagia
Dustin Schwenk
David Atkinson
...
Hanna Hajishirzi
Iz Beltagy
Dirk Groeneveld
Jesse Dodge
Kyle Lo
78
265
0
31 Jan 2024
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Yi Zeng
Hongpeng Lin
Jingwen Zhang
Diyi Yang
Ruoxi Jia
Weiyan Shi
68
284
0
12 Jan 2024
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Evan Hubinger
Carson E. Denison
Jesse Mu
Mike Lambert
Meg Tong
...
Sören Mindermann
Ryan Greenblatt
Buck Shlegeris
Nicholas Schiefer
Ethan Perez
LLMAG
44
159
0
10 Jan 2024
Mixtral of Experts
Albert Q. Jiang
Alexandre Sablayrolles
Antoine Roux
A. Mensch
Blanche Savary
...
Théophile Gervet
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE
LLMAG
108
1,049
0
08 Jan 2024
Make Them Spill the Beans! Coercive Knowledge Extraction from (Production) LLMs
Zhuo Zhang
Guangyu Shen
Guanhong Tao
Shuyang Cheng
Xiangyu Zhang
68
14
0
08 Dec 2023
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
Anay Mehrotra
Manolis Zampetakis
Paul Kassianik
Blaine Nelson
Hyrum Anderson
Yaron Singer
Amin Karbasi
71
239
0
04 Dec 2023
Scalable Extraction of Training Data from (Production) Language Models
Milad Nasr
Nicholas Carlini
Jonathan Hayase
Matthew Jagielski
A. Feder Cooper
Daphne Ippolito
Christopher A. Choquette-Choo
Eric Wallace
Florian Tramèr
Katherine Lee
SILM
32
339
0
28 Nov 2023
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
Hamish Ivison
Yizhong Wang
Valentina Pyatkin
Nathan Lambert
Matthew E. Peters
...
Joel Jang
David Wadden
Noah A. Smith
Iz Beltagy
Hanna Hajishirzi
ALM
ELM
65
187
0
17 Nov 2023
Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation
Rusheb Shah
Quentin Feuillade-Montixi
Soroush Pour
Arush Tagade
Stephen Casper
Javier Rando
47
133
0
06 Nov 2023
DeepInception: Hypnotize Large Language Model to Be Jailbreaker
Xuan Li
Zhanke Zhou
Jianing Zhu
Jiangchao Yao
Tongliang Liu
Bo Han
71
170
0
06 Nov 2023
Detecting Pretraining Data from Large Language Models
Weijia Shi
Anirudh Ajith
Mengzhou Xia
Yangsibo Huang
Daogao Liu
Terra Blevins
Danqi Chen
Luke Zettlemoyer
MIALM
59
177
0
25 Oct 2023
Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
Yangsibo Huang
Samyak Gupta
Mengzhou Xia
Kai Li
Danqi Chen
AAML
48
293
0
10 Oct 2023
Universal and Transferable Adversarial Attacks on Aligned Language Models
Andy Zou
Zifan Wang
Nicholas Carlini
Milad Nasr
J. Zico Kolter
Matt Fredrikson
192
1,376
0
27 Jul 2023
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources
Yizhong Wang
Hamish Ivison
Pradeep Dasigi
Jack Hessel
Tushar Khot
...
David Wadden
Kelsey MacMillan
Noah A. Smith
Iz Beltagy
Hannaneh Hajishirzi
ALM
ELM
87
379
0
07 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
104
758
0
01 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
297
3,712
0
29 May 2023
The False Promise of Imitating Proprietary LLMs
Arnav Gudibande
Eric Wallace
Charles Burton Snell
Xinyang Geng
Hao Liu
Pieter Abbeel
Sergey Levine
Dawn Song
ALM
93
202
0
25 May 2023
Are Chatbots Ready for Privacy-Sensitive Applications? An Investigation into Input Regurgitation and Prompt-Induced Sanitization
Aman Priyanshu
Supriti Vijay
Ayush Kumar
Rakshit Naidu
Fatemehsadat Mireshghallah
SILM
116
24
0
24 May 2023
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
Ning Ding
Yulin Chen
Bokai Xu
Yujia Qin
Zhi Zheng
Shengding Hu
Zhiyuan Liu
Maosong Sun
Bowen Zhou
ALM
91
511
0
23 May 2023
Emergent and Predictable Memorization in Large Language Models
Stella Biderman
USVSN Sai Prashanth
Lintang Sutawika
Hailey Schoelkopf
Quentin G. Anthony
Shivanshu Purohit
Edward Raff
62
122
0
21 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
81
1,231
0
03 Apr 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
832
13,788
0
15 Mar 2023
Automatically Auditing Large Language Models via Discrete Optimization
Erik Jones
Anca Dragan
Aditi Raghunathan
Jacob Steinhardt
73
164
0
08 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
938
12,840
0
27 Feb 2023
Bag of Tricks for Training Data Extraction from Language Models
Weichen Yu
Tianyu Pang
Qian Liu
Chao Du
Bingyi Kang
Yan Huang
Min Lin
Shuicheng Yan
89
49
0
09 Feb 2023
Extracting Training Data from Diffusion Models
Nicholas Carlini
Jamie Hayes
Milad Nasr
Matthew Jagielski
Vikash Sehwag
Florian Tramèr
Borja Balle
Daphne Ippolito
Eric Wallace
DiffM
111
589
0
30 Jan 2023
Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy
Daphne Ippolito
Florian Tramèr
Milad Nasr
Chiyuan Zhang
Matthew Jagielski
Katherine Lee
Christopher A. Choquette-Choo
Nicholas Carlini
PILM
MU
43
60
0
31 Oct 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
353
6,132
0
05 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
726
12,525
0
04 Mar 2022
Quantifying Memorization Across Neural Language Models
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
89
603
0
15 Feb 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
399
2,051
0
31 Dec 2020
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
422
1,868
0
14 Dec 2020