The Feasibility of Topic-Based Watermarking on Academic Peer Reviews

Large language models (LLMs) are increasingly integrated into academic workflows, with many conferences and journals permitting their use for tasks such as language refinement and literature summarization. However, their use in peer review remains prohibited due to concerns about confidentiality breaches, hallucinated content, and inconsistent evaluations. As LLM-generated text becomes harder to distinguish from human writing, there is a growing need for reliable attribution mechanisms to preserve the integrity of the review process. In this work, we evaluate topic-based watermarking (TBW), a lightweight, semantic-aware technique designed to embed detectable signals into LLM-generated text. We conduct a comprehensive assessment across multiple LLM configurations, including base, few-shot, and fine-tuned variants, using authentic peer review data from academic conferences. Our results show that TBW maintains review quality relative to non-watermarked outputs while demonstrating strong robustness to paraphrasing-based evasion. These findings highlight the viability of TBW as a minimally intrusive and practical solution for enforcing LLM usage policies in peer review.
@article{nemecek2025_2505.21636,
  title={The Feasibility of Topic-Based Watermarking on Academic Peer Reviews},
  author={Alexander Nemecek and Yuzhou Jiang and Erman Ayday},
  journal={arXiv preprint arXiv:2505.21636},
  year={2025}
}