Dataset Size Recovery from LoRA Weights
Mohammad Salama, Jonathan Kahana, Eliahu Horwitz, Yedid Hoshen
2024-06-28

Summary
This paper introduces DSiRe, a method that estimates how many training samples were used to fine-tune a model by analyzing its weights, in the common case where fine-tuning uses Low-Rank Adaptation (LoRA).
What's the problem?
Researchers often want to know how many examples were used to train a model. However, existing approaches such as model inversion and membership inference attacks, which try to reconstruct or verify individual training samples, cannot guarantee they have found all of them, because they do not know the size of the training set. Without this number, it is difficult to audit a model's training data or fully understand its capabilities.
What's the solution?
To solve this problem, the authors introduce the task of dataset size recovery, which aims to determine the number of training samples directly from the model's weights. They propose DSiRe, a method built on the observation that both the norm and the singular-value spectrum of the LoRA weight matrices are closely linked to the fine-tuning dataset size, and uses these properties as features for a simple prediction algorithm. To evaluate the approach, they also create and release a new benchmark, LoRA-WiSE, consisting of over 25,000 weight snapshots from more than 2,000 diverse LoRA fine-tuned models. On this benchmark, DSiRe predicts the number of fine-tuning images with a mean absolute error of 0.36 images.
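To make the idea concrete, here is a minimal sketch of the recipe described above: compute the Frobenius norm and singular values of each LoRA update matrix, then fit a simple classifier mapping those features to a dataset size. This is not the authors' implementation; the synthetic `fake_lora` generator and the 1-nearest-neighbor classifier are illustrative assumptions standing in for the LoRA-WiSE benchmark and the paper's prediction algorithm.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

RANK, D_IN, D_OUT = 4, 64, 64  # toy LoRA dimensions

def lora_features(A, B):
    """Per-layer features in the spirit of DSiRe: the Frobenius norm and
    the singular-value spectrum of the low-rank update B @ A."""
    delta_w = B @ A
    norm = np.linalg.norm(delta_w)                       # Frobenius norm
    spectrum = np.linalg.svd(delta_w, compute_uv=False)  # singular values
    return np.concatenate([[norm], spectrum[:RANK]])     # only RANK are nonzero

def fake_lora(n_images, rng):
    """Hypothetical stand-in for a fine-tuned layer: the weight scale is
    loosely tied to n_images, only so the example runs end to end."""
    scale = 1.0 / np.sqrt(n_images)
    A = rng.normal(0.0, scale, (RANK, D_IN))
    B = rng.normal(0.0, scale, (D_OUT, RANK))
    return A, B

rng = np.random.default_rng(0)
sizes = [1, 2, 4, 8, 16, 32]  # candidate fine-tuning set sizes

# Build a labeled feature set from models with known dataset sizes.
X = np.stack([lora_features(*fake_lora(n, rng)) for n in sizes for _ in range(20)])
y = np.repeat(sizes, 20)

# A nearest-neighbor classifier over these features predicts the size
# of the dataset behind an unseen LoRA layer.
clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
A_q, B_q = fake_lora(8, rng)
print("predicted dataset size:", clf.predict([lora_features(A_q, B_q)])[0])
```

In practice the features would come from the real LoRA matrices of many fine-tuned checkpoints (as in LoRA-WiSE) rather than a synthetic generator, with one prediction per layer combined into a final estimate.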
Why it matters?
This research is important because it provides a way to gain insights into how models are trained without needing direct access to the training data. Knowing the dataset size can help researchers better understand model performance and improve future models. Additionally, this method could help identify potential biases in training data by revealing how much data was actually used.
Abstract
Model inversion and membership inference attacks aim to reconstruct and verify the data which a model was trained on. However, they are not guaranteed to find all training samples as they do not know the size of the training set. In this paper, we introduce a new task: dataset size recovery, that aims to determine the number of samples used to train a model, directly from its weights. We then propose DSiRe, a method for recovering the number of images used to fine-tune a model, in the common case where fine-tuning uses LoRA. We discover that both the norm and the spectrum of the LoRA matrices are closely linked to the fine-tuning dataset size; we leverage this finding to propose a simple yet effective prediction algorithm. To evaluate dataset size recovery of LoRA weights, we develop and release a new benchmark, LoRA-WiSE, consisting of over 25000 weight snapshots from more than 2000 diverse LoRA fine-tuned models. Our best classifier can predict the number of fine-tuning images with a mean absolute error of 0.36 images, establishing the feasibility of this attack.