KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
Fan Wang, Juyong Jiang, Chansung Park, Sunghun Kim, Jing Tang
2024-12-12

Summary
This paper introduces KaSA, a new method for fine-tuning large language models (LLMs) that makes adapting these models to specific tasks more efficient by focusing on relevant knowledge.
What's the problem?
As large language models grow in size, they require more computing power and memory to adapt to new tasks. Full fine-tuning is inefficient because it adjusts every parameter, and existing parameter-efficient methods such as LoRA fail to filter out knowledge that is noisy or irrelevant to the target task, which can hurt performance.
What's the solution?
KaSA, short for Knowledge-aware Singular-value Adaptation, addresses these issues using a technique called singular value decomposition (SVD). SVD breaks a weight matrix into components ranked by importance, which lets KaSA dynamically activate the knowledge most relevant to the task while suppressing noisy or irrelevant components. Because only a small SVD-form update is trained, the model adapts efficiently without changing all of its weights.
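To make the idea concrete, here is a minimal PyTorch sketch of one simplified reading of this approach: truncate the smallest singular components of a frozen pretrained weight (treated as minor, noisy knowledge), then learn a low-rank update in SVD form whose singular values are trainable. The `KaSALinear` class, its initialization, and the truncation rule are illustrative assumptions, not the paper's exact implementation, and any regularization the full method applies to the update factors is omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KaSALinear(nn.Module):
    """Illustrative SVD-based adapter in the spirit of KaSA (not the official code).

    The frozen base weight is truncated to drop its smallest singular
    components (minor, potentially noisy knowledge), and the task-specific
    update is parameterized in SVD form with trainable singular values.
    """

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        out_f, in_f = base.weight.shape
        # Knowledge-based truncation (assumed rule: drop the `rank`
        # smallest singular components of the pretrained weight).
        U, S, Vh = torch.linalg.svd(base.weight.data, full_matrices=False)
        k = S.numel() - rank
        self.register_buffer("W_trunc", U[:, :k] @ torch.diag(S[:k]) @ Vh[:k, :])
        if base.bias is not None:
            self.register_buffer("bias", base.bias.data.clone())  # kept frozen
        else:
            self.bias = None
        # SVD-form update: delta_W = U_a @ diag(sigma) @ V_a^T, where sigma
        # plays the role of knowledge-aware singular values learned per task.
        self.U_a = nn.Parameter(torch.zeros(out_f, rank))  # zero init => delta_W = 0 at start
        self.sigma = nn.Parameter(torch.zeros(rank))
        self.V_a = nn.Parameter(torch.randn(in_f, rank) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta_w = self.U_a @ torch.diag(self.sigma) @ self.V_a.T
        return F.linear(x, self.W_trunc + delta_w, self.bias)

# Usage: wrap a pretrained projection and fine-tune only U_a, sigma, V_a.
layer = KaSALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 768))
```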
Why it matters?
This research is significant because it improves how large language models can be adapted for various applications, making them faster and more effective. By focusing on relevant knowledge, KaSA helps ensure that these models perform better on specific tasks while using fewer computational resources, which is crucial as AI technology continues to advance.
Abstract
The increasing sizes of large language models (LLMs) result in significant computational overhead and memory usage when adapting these models to specific tasks or domains. Various parameter-efficient fine-tuning (PEFT) methods have been devised to mitigate these challenges by training a small set of parameters for the task-specific updates of the model weights. Among PEFT methods, LoRA stands out for its simplicity and efficiency, inspiring the development of a series of variants. However, LoRA and its successors disregard the knowledge that is noisy or irrelevant to the targeted task, detrimentally impacting model performance and leading to suboptimality. To address this limitation, we introduce Knowledge-aware Singular-value Adaptation (KaSA), a PEFT method that leverages singular value decomposition (SVD) with knowledge-aware singular values to dynamically activate knowledge based on its relevance to the task at hand. We conduct extensive experiments across a range of LLMs on tasks spanning natural language understanding (NLU), generation (NLG), instruction following, and commonsense reasoning. The experimental results demonstrate that KaSA consistently outperforms full fine-tuning (FFT) and 14 popular PEFT baselines across 16 benchmarks and 4 synthetic datasets, underscoring our method's efficacy and adaptability. The source code of our method is available at https://github.com/juyongjiang/KaSA.