
Robust Multi-bit Text Watermark with LLM-based Paraphrasers

Xiaojun Xu, Jinghan Jia, Yuanshun Yao, Yang Liu, Hang Li

2024-12-10


Summary

This paper presents a new method for embedding hidden multi-bit watermarks in text by paraphrasing sentences with large language models (LLMs), allowing the origin of a text to be tracked securely and unobtrusively.

What's the problem?

As digital content becomes easier to share and copy, it is important to protect original works from being misused or claimed by others. Traditional text watermarking methods can often be spotted by readers or stripped out through simple edits, making it hard to ensure that the original creator is credited.

What's the solution?

The authors fine-tune two LLM-based paraphrasers that rewrite sentences in subtly different ways. To embed a multi-bit watermark, they choose which of the two paraphrasers to apply to each sentence according to a predefined binary code, encoding information into the text without changing its overall meaning. A trained text classifier then acts as the decoder, recovering each bit from the paraphrased sentences. Extensive experiments show that the watermark can be detected with high accuracy while the original meaning is preserved, and that it survives perturbations such as word substitution and further paraphrasing.
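The sentence-level encode/decode loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `paraphrase_0`, `paraphrase_1`, and `decode_bit` are hypothetical stand-ins for the two fine-tuned paraphrasers and the trained classifier.

```python
# Hypothetical sketch of sentence-level multi-bit watermarking.
# The real system would call two fine-tuned LLM paraphrasers and a
# trained text classifier; here stubs mark sentences so the round
# trip is visible.

def paraphrase_0(sentence: str) -> str:
    # Stand-in for the paraphraser used to encode bit 0.
    return sentence + " [p0]"

def paraphrase_1(sentence: str) -> str:
    # Stand-in for the paraphraser used to encode bit 1.
    return sentence + " [p1]"

def decode_bit(sentence: str) -> int:
    # Stand-in for the trained classifier that predicts which
    # paraphraser produced the sentence.
    return 1 if "[p1]" in sentence else 0

def embed_watermark(sentences, bits):
    # Encode one bit per sentence by selecting a paraphraser.
    return [paraphrase_1(s) if b else paraphrase_0(s)
            for s, b in zip(sentences, bits)]

def extract_watermark(sentences):
    # Decode each sentence back into one watermark bit.
    return [decode_bit(s) for s in sentences]

text = ["The cat sat on the mat.",
        "It was a sunny day.",
        "Birds sang outside."]
code = [1, 0, 1]
watermarked = embed_watermark(text, code)
print(extract_watermark(watermarked))  # → [1, 0, 1]
```

In the actual pipeline the two paraphrasers produce fluent, semantically equivalent sentences whose difference is only detectable by the trained decoder, which is what makes the watermark imperceptible to readers.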

Why it matters?

This research is important because it provides a robust way to protect intellectual property in digital formats. By embedding watermarks that are hard to detect and remove, creators can ensure their work is recognized and credited, which is crucial in an age where content is easily shared and copied online.

Abstract

We propose an imperceptible multi-bit text watermark embedded by paraphrasing with LLMs. We fine-tune a pair of LLM paraphrasers that are designed to behave differently so that their paraphrasing difference reflected in the text semantics can be identified by a trained decoder. To embed our multi-bit watermark, we use two paraphrasers alternatively to encode the pre-defined binary code at the sentence level. Then we use a text classifier as the decoder to decode each bit of the watermark. Through extensive experiments, we show that our watermarks can achieve over 99.99% detection AUC with small (1.1B) text paraphrasers while keeping the semantic information of the original sentence. More importantly, our pipeline is robust under word substitution and sentence paraphrasing perturbations and generalizes well to out-of-distributional data. We also show the stealthiness of our watermark with LLM-based evaluation. We open-source the code: https://github.com/xiaojunxu/multi-bit-text-watermark.