Temperature Matters: Enhancing Watermark Robustness Against Paraphrasing Attacks

Badr Youbi Idrissi
Monica Millunzi
Amelia Sorrenti
Lorenzo Baraldi
Daryna Dementieva
4 pages, 4 figures, 2-page bibliography
Abstract

Large Language Models (LLMs) are establishing themselves as powerful instruments across many sectors of society. While they offer valuable support to individuals, their potential for misuse raises serious concerns. In response, several academic efforts have introduced watermarking techniques, which embed markers within machine-generated text to enable algorithmic identification. This work develops a novel methodology for detecting synthetic text, with the overarching goal of ensuring the ethical use of LLMs in AI-driven text generation. We begin by replicating the findings of a previous baseline study, highlighting its susceptibility to variations in the underlying generation model. We then propose a new watermarking approach and rigorously evaluate it on paraphrased generated text to assess its robustness. Experimental results show that our proposal is more robust than the \cite{aarson} watermarking method.
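For readers unfamiliar with this family of techniques, the sketch below illustrates the general mechanism the abstract alludes to. It is a minimal, generic "green-list" scheme in the spirit of prior soft-watermarking work, not the method proposed in this paper; every name and constant is a hypothetical placeholder. At generation time, a hash of the previous token selects a pseudorandom "green" subset of the vocabulary whose logits are boosted; at detection time, a z-test checks whether green tokens are over-represented.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50_000   # hypothetical vocabulary size
GREEN_FRAC = 0.5      # fraction of the vocabulary marked "green" per step
BIAS = 2.0            # logit boost applied to green tokens at generation time

def green_list(prev_token: int) -> set[int]:
    # Seed a pseudorandom vocabulary partition with the previous token,
    # so that generator and detector derive the same "green" set.
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(range(VOCAB_SIZE), int(GREEN_FRAC * VOCAB_SIZE)))

def watermark_logits(logits: list[float], prev_token: int) -> list[float]:
    # Generation side: softly bias green tokens so watermarked text
    # over-represents them without destroying fluency.
    green = green_list(prev_token)
    return [x + BIAS if i in green else x for i, x in enumerate(logits)]

def detect_z_score(tokens: list[int]) -> float:
    # Detection side: count green tokens and compare the count with the
    # binomial expectation under unwatermarked (null) text.
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    expected = GREEN_FRAC * n
    std = math.sqrt(n * GREEN_FRAC * (1 - GREEN_FRAC))
    return (hits - expected) / std  # large z => likely watermarked
```

A paraphrasing attack rewrites the token sequence, breaking many of the previous-token/green-list pairings and driving the detector's z-score back toward zero; this is precisely the robustness gap the paper targets.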
