
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking

Yuan Yao, Jin Song, Jian Jin

2025-07-15

Summary

This paper introduces NeuralMark, a new method for protecting the ownership of neural network models by embedding special watermarks in their weights that are very hard to fake or remove.

What's the problem?

Once a neural network model is shared or sold, others can try to steal it by claiming the model as their own or by stripping out the watermarks that prove who owns it. Existing weight-based watermarking methods can be defeated by attacks such as forging (planting a fake watermark), overwriting (embedding a new watermark on top of the old one), fine-tuning, and pruning.

What's the solution?

NeuralMark addresses this with a hashed watermark filter: the watermark itself is passed through a hash function, and the result determines which model weights carry it. Because even a tiny change to the watermark completely changes the hash, an attacker cannot forge a convincing fake watermark or overwrite the real one, and the embedded watermark survives further training (fine-tuning) and removal of parts of the model (pruning).
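To make the idea concrete, here is a minimal, hypothetical sketch of a hash-filtered watermark, not the paper's actual algorithm: the watermark bits (plus an assumed owner secret) are hashed to choose which weight positions carry the mark, and each bit is encoded in the sign of a selected weight. All function names and the sign-based encoding are illustrative assumptions.

```python
import hashlib
import random

def select_positions(watermark_bits, num_weights, key=b"owner-secret"):
    # Hash the watermark together with a secret key to seed a PRNG.
    # The hash's avalanche effect means any change to the watermark
    # scrambles the selected positions, so a forged or overwritten
    # watermark lands on different weights entirely.
    digest = hashlib.sha256(key + bytes(watermark_bits)).digest()
    rng = random.Random(digest)
    return rng.sample(range(num_weights), len(watermark_bits))

def embed(weights, watermark_bits, key=b"owner-secret"):
    # Encode each watermark bit in the sign of one selected weight
    # (a toy embedding; real schemes perturb weights more carefully).
    positions = select_positions(watermark_bits, len(weights), key)
    for bit, pos in zip(watermark_bits, positions):
        weights[pos] = abs(weights[pos]) if bit else -abs(weights[pos])
    return weights

def verify(weights, watermark_bits, key=b"owner-secret"):
    # Recompute the hash-selected positions and check the encoded bits.
    positions = select_positions(watermark_bits, len(weights), key)
    return all((weights[p] > 0) == bool(b)
               for b, p in zip(watermark_bits, positions))
```

Verification only succeeds when the claimed watermark and secret key reproduce the same hash-selected positions, which is the filtering property the paper relies on.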

Why does it matter?

This matters because neural networks are valuable technologies and protecting ownership ensures that creators get credit and control over their models. NeuralMark helps prevent theft and misuse of AI models, making the technology safer and more trustworthy.

Abstract

NeuralMark is a robust neural network watermarking method using a hashed watermark filter to protect model ownership against forging, overwriting, fine-tuning, and pruning attacks.