SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs
Dinithi Jayasuriya, Sina Tayebati, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi
2025-02-10
Summary
This paper introduces SPARC, a method that helps large language models (LLMs) learn new tasks efficiently without forgetting what they already know. It performs prompt tuning in a lower-dimensional subspace to make learning faster and more effective.
What's the problem?
When AI models learn new tasks, they often overwrite or forget the information they learned before, a problem called catastrophic forgetting. Existing solutions to this issue can be slow, use too much memory, or require changing the model's structure, making them less practical for real-world use.
What's the solution?
The researchers created SPARC, which uses principal component analysis (PCA) to focus learning on the most informative directions in the training data. The method updates only a tiny fraction of the model's parameters while keeping the rest frozen, so the model retains its previous knowledge. They also integrate LoRA (low-rank adaptation) to make the system adaptable to different levels of computing power. SPARC was tested on various tasks and learned new ones without forgetting old ones while using very few resources.
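To make the idea concrete, here is a minimal sketch (not the authors' code; all names, dimensions, and the toy objective are illustrative) of optimizing a soft prompt inside a PCA subspace of the task data, so that only a handful of coefficients are trained while everything else stays frozen:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 64          # hidden size of the (toy) model
k = 4                 # PCA subspace dimension (assumption)
n_samples = 200

# 1. Collect task-data embeddings (random stand-ins here).
X = rng.normal(size=(n_samples, d_model))

# 2. PCA: top-k principal directions of the centered data.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
basis = Vt[:k]                      # (k, d_model) orthonormal subspace basis

# 3. The trainable prompt lives in the k-dim subspace:
#    only the k coefficients in z are ever updated.
z = np.zeros(k)
prompt = mean + z @ basis           # (d_model,) soft-prompt embedding

# Toy objective: pull the prompt toward a target embedding by
# gradient descent on z alone; the "model weights" stay frozen.
target = rng.normal(size=d_model)
lr = 0.1
for _ in range(100):
    grad_prompt = 2 * (prompt - target)   # d(loss)/d(prompt)
    z -= lr * (basis @ grad_prompt)       # chain rule into the subspace
    prompt = mean + z @ basis
```

The key point the sketch illustrates is the parameter count: the optimizer touches `k` numbers instead of `d_model`, which is how updates stay focused on the most relevant features of the data.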
Why it matters?
This matters because it allows AI models to continuously learn and adapt to new tasks without needing expensive retraining or risking memory loss. SPARC makes AI systems more efficient and scalable, which is important for applications like personalized assistants, scientific research, and other areas where models need to handle many tasks over time.
Abstract
We propose SPARC, a lightweight continual learning framework for large language models (LLMs) that enables efficient task adaptation through prompt tuning in a lower-dimensional space. By leveraging principal component analysis (PCA), we identify a compact subspace of the training data. Optimizing prompts in this lower-dimensional space enhances training efficiency, as it focuses updates on the most relevant features while reducing computational overhead. Furthermore, since the model's internal structure remains unaltered, the extensive knowledge gained from pretraining is fully preserved, ensuring that previously learned information is not compromised during adaptation. Our method achieves high knowledge retention in both task-incremental and domain-incremental continual learning setups while fine-tuning only 0.04% of the model's parameters. Additionally, by integrating LoRA, we enhance adaptability to computational constraints, allowing for a tradeoff between accuracy and training cost. Experiments on the SuperGLUE benchmark demonstrate that our PCA-based prompt tuning combined with LoRA maintains full knowledge retention while improving accuracy, utilizing only 1% of the model's parameters. These results establish our approach as a scalable and resource-efficient solution for continual learning in LLMs.
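The accuracy/cost tradeoff via LoRA rests on a standard low-rank reparameterization. A minimal sketch (illustrative only; the rank, shapes, and initialization are assumptions, not the paper's configuration) of a LoRA-style layer, where a frozen weight W is adapted as W + BA with only the small factors trained:

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in, r = 128, 128, 4        # r << d_in is the LoRA rank (assumption)

W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection (zero init)

def adapted_forward(x):
    # Equivalent to (W + B @ A) @ x, without materializing the full update.
    return W @ x + B @ (A @ x)

x = rng.normal(size=d_in)
y = adapted_forward(x)

# Trainable-parameter budget: only A and B are updated.
trainable = A.size + B.size         # 2 * r * 128 = 1024 parameters
total = W.size                      # 128 * 128  = 16384 parameters
```

With B initialized to zero, the adapted layer is exactly the frozen layer at the start of training, and the rank r directly controls the tradeoff between added capacity and training cost that the abstract describes.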