
A Watermark for Black-Box Language Models

2 October 2024
Dara Bahri
John Wieting
Dana Alon
Donald Metzler
Abstract

Watermarking has recently emerged as an effective strategy for detecting the outputs of large language models (LLMs). Most existing schemes require white-box access to the model's next-token probability distribution, which is typically not accessible to downstream users of an LLM API. In this work, we propose a principled watermarking scheme that requires only the ability to sample sequences from the LLM (i.e., black-box access), boasts a distortion-free property, and can be chained or nested using multiple secret keys. We provide performance guarantees, demonstrate how it can be leveraged when white-box access is available, and show when it can outperform existing white-box schemes via comprehensive experiments.
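
The abstract stops short of describing the mechanism. For intuition only, below is a minimal Python sketch of the keyed-hash scoring idea that watermark detectors of this kind commonly build on; it is not the paper's scheme, and SECRET_KEY, ngram_score, and detection_statistic are hypothetical names. HMAC-SHA256 as the keyed pseudorandom function and whitespace tokenization are assumptions for illustration.

import hashlib
import hmac
import math

# Hypothetical illustration: a generic secret-keyed pseudorandom score
# per n-gram, a common building block for watermark detection. This is
# NOT the scheme proposed in the paper.

SECRET_KEY = b"my-secret-key"  # assumption: any secret shared with the detector

def ngram_score(ngram: str, key: bytes = SECRET_KEY) -> float:
    """Map an n-gram to a pseudorandom value in [0, 1) via HMAC-SHA256."""
    digest = hmac.new(key, ngram.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def detection_statistic(tokens: list[str], n: int = 2) -> float:
    """Sum -log(1 - u) over the n-gram scores u of the text.

    Under the null (unwatermarked text), each u behaves like an
    independent Uniform(0, 1) draw, so the sum is Gamma-distributed;
    improbably large values indicate watermarked text.
    """
    scores = [
        ngram_score(" ".join(tokens[i : i + n]))
        for i in range(len(tokens) - n + 1)
    ]
    return sum(-math.log(1.0 - u) for u in scores)

if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog".split()
    print(f"detection statistic: {detection_statistic(text):.3f}")

A detector holding the same key recomputes the statistic on candidate text and flags values that are too large to be plausible under the Gamma null; a sampling-only (black-box) scheme must arrange for generated text to score high without ever touching the model's next-token probabilities.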

@article{bahri2024_2410.02099,
  title={A Watermark for Black-Box Language Models},
  author={Dara Bahri and John Wieting and Dana Alon and Donald Metzler},
  journal={arXiv preprint arXiv:2410.02099},
  year={2024}
}