Multi-use LLM Watermarking and the False Detection Problem

19 June 2025

Main: 9 pages; Bibliography: 2 pages; Appendix: 19 pages; 15 figures; 7 tables

Abstract

Digital watermarking is a promising solution for mitigating some of the risks arising from the misuse of automatically generated text. Existing approaches either embed non-specific watermarks, which allow detection of any text generated by a particular sampler, or embed specific keys, which allow identification of the LLM user. However, using the same embedding for both detection and user identification leads to a false detection problem: as user capacity grows, unwatermarked text is increasingly likely to be falsely detected as watermarked. Through theoretical analysis, we identify the underlying causes of this phenomenon. Building on these insights, we propose Dual Watermarking, which jointly encodes detection and identification watermarks into generated text, significantly reducing false positives while maintaining high detection accuracy. Our experimental results validate our theoretical findings and demonstrate the effectiveness of our approach.
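
The link between user capacity and false detections can be illustrated with a standard multiple-testing calculation. The sketch below is not the paper's analysis. It assumes, purely for illustration, that detection runs one test per candidate user key, that these tests are independent on unwatermarked text, and that each has the same per-key false-positive rate `alpha`. The function names and parameters are hypothetical.

```python
def family_false_positive_rate(alpha: float, num_keys: int) -> float:
    """Probability that at least one of `num_keys` independent tests fires
    on unwatermarked text, when each test has false-positive rate `alpha`.

    Assumption (illustrative only): one independent test per user key.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if num_keys < 0:
        raise ValueError("num_keys must be non-negative")
    return 1.0 - (1.0 - alpha) ** num_keys


def bonferroni_per_key_alpha(target_alpha: float, num_keys: int) -> float:
    """Per-key rate that keeps the overall rate at or below `target_alpha`
    under a Bonferroni correction (a generic remedy, not the paper's)."""
    if num_keys <= 0:
        raise ValueError("num_keys must be positive")
    return target_alpha / num_keys


if __name__ == "__main__":
    # Overall false-positive rate as the number of user keys grows.
    for n in (1, 10, 100, 1_000, 10_000):
        print(n, family_false_positive_rate(1e-3, n))
```

Under these assumptions the overall false-positive rate approaches 1 as the number of keys grows, even when each per-key test is conservative. Tightening the per-key threshold, as the Bonferroni helper does, restores the overall rate but generally costs detection power.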

@article{fu2025_2506.15975,
  title={Multi-use LLM Watermarking and the False Detection Problem},
  author={Zihao Fu and Chris Russell},
  journal={arXiv preprint arXiv:2506.15975},
  year={2025}
}