
DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation

Guanzhi Deng, Bo Li, Ronghao Chen, Huacan Wang, Linqi Song, Lijie Wen

2026-01-12


Summary

This paper introduces a new method, DR-LoRA, for efficiently fine-tuning very large language models that use a 'mixture of experts' approach. These models are powerful but can be difficult to adapt to specific tasks without using a lot of computing resources.

What's the problem?

Large language models built with a 'mixture of experts' system have many different parts, called experts, that specialize in different things. When you try to adapt these models to a new task, a common technique called LoRA adds a small amount of extra information to each expert. However, current methods add the *same* amount of extra information to every expert, even though some experts are more important for the new task than others. This is inefficient because it wastes resources on experts that aren't needed and doesn't give enough resources to the experts that *are* important.
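To make the baseline concrete, here is a minimal NumPy sketch of standard LoRA applied uniformly to MoE experts. The dimensions, the rank of 8, and the helper names are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_experts, rank = 64, 256, 4, 8  # illustrative sizes

# Frozen pretrained expert weights (one FFN matrix per expert).
W = [rng.standard_normal((d_ff, d_model)) for _ in range(n_experts)]

# Uniform LoRA: every expert gets the SAME rank-r adapter W + B @ A,
# regardless of how relevant that expert is to the downstream task.
A = [rng.standard_normal((rank, d_model)) * 0.01 for _ in range(n_experts)]
B = [np.zeros((d_ff, rank)) for _ in range(n_experts)]  # B = 0, so each
# adapter starts as a no-op and the model initially matches pretrained.

def expert_forward(i, x):
    """Adapted expert i applied to an input x of shape (d_model,)."""
    return W[i] @ x + B[i] @ (A[i] @ x)
```

Because every `A[i]`/`B[i]` pair has identical shape, each expert receives exactly the same adapter budget, which is the resource mismatch the paper targets.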

What's the solution?

DR-LoRA solves this problem by dynamically adjusting how much extra information (LoRA rank) is added to each expert. It figures out which experts are most useful for the specific task at hand by looking at how often they're used and how important their adjustments are. Then, it automatically increases the amount of extra information given to the most important experts, creating a system where each expert has just the right amount of resources for the job. This is done during the fine-tuning process itself, so it adapts to the task as it learns.
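The rank-adjustment idea can be sketched as follows. This is a hypothetical simplification: it assumes a simple product form for the saliency score (routing frequency times a rank-importance statistic) and redistributes rank from the least to the most salient experts to stay within a fixed budget; the paper's exact scoring function and growth schedule may differ.

```python
import numpy as np

def saliency(routing_freq, rank_importance):
    # routing_freq[i]: fraction of tokens routed to expert i recently.
    # rank_importance[i]: importance of expert i's current LoRA update
    # (e.g. a magnitude statistic of its B @ A matrix). The product
    # form here is an illustrative assumption.
    return routing_freq * rank_importance

def grow_ranks(ranks, routing_freq, rank_importance, budget=2, step=2):
    """Give `step` extra rank to the `budget` most salient experts,
    taking the same amount from the least salient ones so the total
    rank (and hence parameter) budget stays fixed."""
    s = saliency(routing_freq, rank_importance)
    order = np.argsort(s)                        # ascending saliency
    new_ranks = ranks.copy()
    new_ranks[order[-budget:]] += step           # expand salient experts
    new_ranks[order[:budget]] = np.maximum(
        new_ranks[order[:budget]] - step, 1)     # shrink the rest, floor at 1
    return new_ranks

ranks = np.full(4, 8)                        # start from a uniform rank of 8
freq = np.array([0.50, 0.25, 0.15, 0.10])    # observed routing frequencies
imp = np.array([0.9, 0.6, 0.3, 0.2])         # per-expert rank importance
new_ranks = grow_ranks(ranks, freq, imp)     # heterogeneous rank allocation
```

Repeating this step during fine-tuning lets the rank distribution drift toward the experts the task actually exercises, while the total parameter count is unchanged.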

Why it matters?

This research is important because it makes fine-tuning large language models more efficient. By intelligently allocating resources to the experts that need them most, DR-LoRA allows these models to perform better on specific tasks while using the same amount of computing power as older methods. This means we can get more out of these powerful models without needing even more powerful hardware.

Abstract

Mixture-of-Experts (MoE) has become a prominent paradigm for scaling Large Language Models (LLMs). Parameter-efficient fine-tuning (PEFT), such as LoRA, is widely adopted to adapt pretrained MoE LLMs to downstream tasks. However, existing approaches assign identical LoRA ranks to all experts, overlooking the intrinsic functional specialization within MoE LLMs. This uniform allocation leads to a resource mismatch: task-relevant experts are under-provisioned while less relevant ones receive redundant parameters. We propose a Dynamic Rank LoRA framework named DR-LoRA, which dynamically grows expert LoRA ranks during fine-tuning based on task-specific demands. DR-LoRA employs an Expert Saliency Scoring mechanism that integrates expert routing frequency and LoRA rank importance to quantify each expert's demand for additional capacity. Experts with higher saliency scores are prioritized for rank expansion, enabling the automatic formation of a heterogeneous rank distribution tailored to the target task. Experiments on multiple benchmarks demonstrate that DR-LoRA consistently outperforms standard LoRA and static allocation strategies under the same parameter budget, achieving superior task performance with more efficient parameter utilization.