SEAL: Entangled White-box Watermarks on Low-Rank Adaptation
Giyeong Oh, Saejin Kim, Woohyun Cho, Sangkyu Lee, Jiwan Chung, Dokyung Song, Youngjae Yu
2025-01-21

Summary
This paper introduces SEAL, a new method for protecting the copyright of AI models that are fine-tuned with a technique called LoRA. SEAL works like a digital watermark hidden inside the model, letting the creator prove ownership without affecting how well the AI works.
What's the problem?
AI models fine-tuned with LoRA are becoming really popular because they're efficient and easy to share. But there's a big issue: there's no reliable way to prove who owns these models. It's like trying to prove you wrote a book when your name appears nowhere in it. That makes it easy for people to steal or misuse LoRA models without giving credit to the original creators.
What's the solution?
The researchers created SEAL, which stands for SEcure wAtermarking on LoRA weights. It hides a secret signature, a fixed matrix called a passport, inside the model that only the creator knows about. SEAL places this passport between the two small trainable matrices that LoRA adds to the model. The clever part is that, during training, the passport becomes entangled with the trainable weights, so it's really hard to remove or change without breaking the whole model. Before the weights are shared, the passport is hidden so it isn't visible from the outside, but the creator can still reveal it later to prove ownership (a simplified sketch of the idea follows).
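The sketch below is a minimal illustration of the core idea, not the authors' implementation: a LoRA-style linear layer where a fixed, non-trainable passport matrix C sits between the trainable factors B and A, so the adapter update is B·C·A instead of the usual B·A. The class name, shapes, and initialization are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PassportLoRALinear(nn.Module):
    def __init__(self, d_in, d_out, rank, passport):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)                  # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)   # trainable LoRA factor
        self.B = nn.Parameter(torch.zeros(d_out, rank))         # trainable LoRA factor
        # Secret passport: fixed during training, known only to the model's owner.
        self.register_buffer("C", passport)                     # shape (rank, rank)

    def forward(self, x):
        # y = x W^T + x (B C A)^T: gradients reach only A and B, so the learned
        # factors end up entangled with the fixed passport C.
        delta = self.B @ self.C @ self.A                         # (d_out, d_in)
        return self.base(x) + x @ delta.T

rank = 8
passport = torch.randn(rank, rank)   # owner's secret, e.g. derived from a private key
layer = PassportLoRALinear(d_in=64, d_out=64, rank=rank, passport=passport)
y = layer(torch.randn(2, 64))        # used as a drop-in replacement for a linear layer
```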
Why it matters?
This matters because as AI becomes more important in our lives, we need ways to protect the hard work that goes into creating these models. SEAL helps make sure that creators get credit for their work and can prove ownership if someone tries to steal their AI. It also doesn't slow the AI down or make it work worse: the model handles all kinds of tasks just as well as before. This could encourage more people to share their AI models without worrying about them being stolen, which could lead to faster progress in AI research and development.
Abstract
Recently, LoRA and its variants have become the de facto strategy for training and sharing task-specific versions of large pretrained models, thanks to their efficiency and simplicity. However, the issue of copyright protection for LoRA weights, especially through watermark-based techniques, remains underexplored. To address this gap, we propose SEAL (SEcure wAtermarking on LoRA weights), a universal white-box watermarking method for LoRA. SEAL embeds a secret, non-trainable matrix between the trainable LoRA weights, serving as a passport to claim ownership. SEAL then entangles the passport with the LoRA weights through training, without any additional loss term for the entanglement, and distributes the finetuned weights after hiding the passport. When applying SEAL, we observed no performance degradation across commonsense reasoning, textual/visual instruction tuning, and text-to-image synthesis tasks. We demonstrate that SEAL is robust against a variety of known attacks: removal, obfuscation, and ambiguity attacks.
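To illustrate what "hiding the passport" before distribution could look like, here is a minimal sketch under an assumed scheme: the passport C is split into two factors that are absorbed into B and A, so the released weights look like an ordinary LoRA adapter while still computing B·C·A. The factorization choice (SVD) and the function name are assumptions made for illustration; the paper's actual hiding procedure may differ.

```python
import torch

def hide_passport(B, C, A):
    # Split the passport as C = C1 @ C2 (here via SVD, one convenient choice)
    # and absorb the halves into the adapter factors before release.
    U, S, Vh = torch.linalg.svd(C)
    C1 = U @ torch.diag(S.sqrt())
    C2 = torch.diag(S.sqrt()) @ Vh
    B_pub = B @ C1    # distributed in place of B
    A_pub = C2 @ A    # distributed in place of A
    return B_pub, A_pub

rank, d = 8, 64
B, A, C = torch.randn(d, rank), torch.randn(rank, d), torch.randn(rank, rank)
B_pub, A_pub = hide_passport(B, C, A)
# The public adapter computes the same update as the entangled one ...
assert torch.allclose(B_pub @ A_pub, B @ C @ A, atol=1e-4)
# ... but the passport C itself is no longer directly visible in the released weights.
```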