On the Reliability of Watermarks for Large Language Models

7 June 2023
John Kirchenbauer
Jonas Geiping
Yuxin Wen
Manli Shu
Khalid Saifullah
Kezhi Kong
Kasun Fernando
Aniruddha Saha
Micah Goldblum
Tom Goldstein
    WaLM

Papers citing "On the Reliability of Watermarks for Large Language Models"

41 / 91 papers shown
Double-I Watermark: Protecting Model Copyright for LLM Fine-tuning
Shen Li
Liuyi Yao
Jinyang Gao
Lan Zhang
Yaliang Li
49
11
0
22 Feb 2024
Secret Collusion among Generative AI Agents: Multi-Agent Deception via Steganography
S. Motwani
Mikhail Baranchuk
Martin Strohmeier
Vijay Bolina
Philip Torr
Lewis Hammond
Christian Schroeder de Witt
48
4
0
12 Feb 2024
Whispers in the Machine: Confidentiality in LLM-integrated Systems
Jonathan Evertz
Merlin Chlosta
Lea Schönherr
Thorsten Eisenhofer
79
17
0
10 Feb 2024
Copyright Protection in Generative AI: A Technical Perspective
Jie Ren
Han Xu
Pengfei He
Yingqian Cui
Shenglai Zeng
...
Hongzhi Wen
Jiayuan Ding
Hui Liu
Yi Chang
Jiliang Tang
DeLMO
33
33
0
04 Feb 2024
Adaptive Text Watermark for Large Language Models
Yepeng Liu
Yuheng Bu
WaLM
20
18
0
25 Jan 2024
Optimizing watermarks for large language models
Bram Wouters
WaLM
30
4
0
28 Dec 2023
Towards Optimal Statistical Watermarking
Baihe Huang
Hanlin Zhu
Banghua Zhu
Kannan Ramchandran
Michael I. Jordan
Jason D. Lee
Jiantao Jiao
WaLM
39
11
0
13 Dec 2023
AI Control: Improving Safety Despite Intentional Subversion
Ryan Greenblatt
Buck Shlegeris
Kshitij Sachan
Fabien Roger
31
40
0
12 Dec 2023
On the Learnability of Watermarks for Language Models
Chenchen Gu
Xiang Lisa Li
Percy Liang
Tatsunori Hashimoto
WaLM
71
33
0
07 Dec 2023
New Evaluation Metrics Capture Quality Degradation due to LLM Watermarking
Karanpartap Singh
James Zou
WaLM
116
9
0
04 Dec 2023
A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly
Yifan Yao
Jinhao Duan
Kaidi Xu
Yuanfang Cai
Eric Sun
Yue Zhang
PILM
ELM
54
476
0
04 Dec 2023
SymNoise: Advancing Language Model Fine-tuning with Symmetric Noise
A. Yadav
Arjun Singh
54
2
0
03 Dec 2023
AuthentiGPT: Detecting Machine-Generated Text via Black-Box Language Models Denoising
Zhen Guo
Shangdi Yu
DeLMO
34
10
0
13 Nov 2023
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models
Hanlin Zhang
Benjamin L. Edelman
Danilo Francati
Daniele Venturi
G. Ateniese
Boaz Barak
WaLM
146
55
0
07 Nov 2023
Contextual Confidence and Generative AI
Shrey Jain
Zoe Hitzig
Pamela Mishkin
44
5
0
02 Nov 2023
Preventing Language Models From Hiding Their Reasoning
Fabien Roger
Ryan Greenblatt
LRM
31
16
0
27 Oct 2023
Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey
Soumya Suvra Ghosal
Souradip Chakraborty
Jonas Geiping
Furong Huang
Dinesh Manocha
Amrit Singh Bedi
DeLMO
38
33
0
23 Oct 2023
A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions
Junchao Wu
Shu Yang
Runzhe Zhan
Yulin Yuan
Derek F. Wong
Lidia S. Chao
DeLMO
32
23
0
23 Oct 2023
REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models
Ruisi Zhang
Shehzeen Samarah Hussain
Paarth Neekhara
F. Koushanfar
34
27
0
18 Oct 2023
Embarrassingly Simple Text Watermarks
Ryoma Sato
Yuki Takezawa
Han Bao
Kenta Niwa
Makoto Yamada
WaLM
32
14
0
13 Oct 2023
A Semantic Invariant Robust Watermark for Large Language Models
Aiwei Liu
Leyi Pan
Xuming Hu
Shiao Meng
Lijie Wen
WaLM
53
57
0
10 Oct 2023
NEFTune: Noisy Embeddings Improve Instruction Finetuning
Neel Jain
Ping-yeh Chiang
Yuxin Wen
John Kirchenbauer
Hong-Min Chu
...
Avi Schwarzschild
Aniruddha Saha
Micah Goldblum
Jonas Geiping
Tom Goldstein
31
76
0
09 Oct 2023
Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index
Megha Chakraborty
S.M. Towhidul Islam Tonmoy
S. M. Mehedi
Krish Sharma
Niyar R. Barman
...
Tanay Kumar
Vinija Jain
Aman Chadha
Amit P. Sheth
Amitava Das
DeLMO
22
21
0
08 Oct 2023
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation
Abe Bohan Hou
Jingyu Zhang
Tianxing He
Yichen Wang
Yung-Sung Chuang
Hongwei Wang
Lingfeng Shen
Benjamin Van Durme
Daniel Khashabi
Yulia Tsvetkov
WaLM
34
0
0
06 Oct 2023
Co-audit: tools to help humans double-check AI-generated content
Andrew D. Gordon
Carina Negreanu
J. Cambronero
Rasika Chakravarthy
Ian Drosos
...
Hannah Richardson
Advait Sarkar
Stephanie Simmons
Jack Williams
Ben Zorn
41
13
0
02 Oct 2023
Necessary and Sufficient Watermark for Large Language Models
Yuki Takezawa
Ryoma Sato
Han Bao
Kenta Niwa
Makoto Yamada
WaLM
50
7
0
02 Oct 2023
Warfare: Breaking the Watermark Protection of AI-Generated Content
Guanlin Li
Yifei Chen
Jie Zhang
Shangwei Guo
Tianwei Zhang
Jiwei Li
WIGM
60
3
0
27 Sep 2023
Detecting ChatGPT: A Survey of the State of Detecting ChatGPT-Generated Text
Mahdi Dhaini
Wessel Poelman
Ege Erdogan
DeLMO
51
12
0
14 Sep 2023
Baseline Defenses for Adversarial Attacks Against Aligned Language Models
Neel Jain
Avi Schwarzschild
Yuxin Wen
Gowthami Somepalli
John Kirchenbauer
Ping-yeh Chiang
Micah Goldblum
Aniruddha Saha
Jonas Geiping
Tom Goldstein
AAML
68
343
0
01 Sep 2023
Identifying and Mitigating the Security Risks of Generative AI
Clark W. Barrett
Bradley L Boyd
Elie Bursztein
Nicholas Carlini
Brad Chen
...
Zulfikar Ramzan
Khawaja Shams
D. Song
Ankur Taly
Diyi Yang
SILM
46
93
0
28 Aug 2023
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
Xinyue Shen
Zhenpeng Chen
Michael Backes
Yun Shen
Yang Zhang
SILM
40
250
0
07 Aug 2023
PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification
Hongwei Yao
Jian Lou
Kui Ren
Zhan Qin
AAML
VLM
41
25
0
05 Aug 2023
Advancing Beyond Identification: Multi-bit Watermark for Large Language Models
Kiyoon Yoo
Wonhyuk Ahn
Nojun Kwak
WaLM
38
17
0
01 Aug 2023
Towards Codable Watermarking for Injecting Multi-bits Information to LLMs
Lean Wang
Wenkai Yang
Deli Chen
Hao Zhou
Yankai Lin
Fandong Meng
Jie Zhou
Xu Sun
WaLM
43
16
0
29 Jul 2023
Three Bricks to Consolidate Watermarks for Large Language Models
Pierre Fernandez
Antoine Chaffin
Karim Tit
Vivien Chappelier
Teddy Furon
WaLM
21
47
0
26 Jul 2023
Self-Consuming Generative Models Go MAD
Sina Alemohammad
Josue Casco-Rodriguez
Lorenzo Luzi
Ahmed Imtiaz Humayun
Hossein Babaei
Daniel LeJeune
Ali Siahkoohi
Richard G. Baraniuk
WIGM
21
141
0
04 Jul 2023
Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense
Kalpesh Krishna
Yixiao Song
Marzena Karpinska
John Wieting
Mohit Iyyer
DeLMO
21
299
0
23 Mar 2023
Paraphrase Identification with Deep Learning: A Review of Datasets and Methods
Chao Zhou
Cheng Qiu
Daniel Ernesto Acuna
37
25
0
13 Dec 2022
CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks
Xuanli He
Qiongkai Xu
Yi Zeng
Lingjuan Lyu
Fangzhao Wu
Jiwei Li
R. Jia
WaLM
188
72
0
19 Sep 2022
Protecting Intellectual Property of Language Generation APIs with Lexical Watermark
Xuanli He
Qiongkai Xu
Lingjuan Lyu
Fangzhao Wu
Chenguang Wang
WaLM
177
95
0
05 Dec 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
282
2,007
0
31 Dec 2020