Crowded in B-Space: Calibrating Shared Directions for LoRA Merging

Yixuan Tang, Yi Yang

2026-04-21

Summary

This paper investigates a problem with combining small, task-specific modifications, called LoRA adapters, that are trained on top of a large AI model for different tasks. While combining these adapters seems like a good idea, it often makes the model perform worse than expected.

What's the problem?

When you train a separate LoRA adapter for each task and then merge them, performance drops. Each adapter factors its weight update into two matrices: 'B' (the output side) and 'A' (the input side). The interference isn't spread evenly across the adapter – it comes mainly from 'B'. Across tasks, the 'B' matrices keep reusing a small set of shared directions, while the 'A' matrices stay much more task-specific. When the adapters are merged, those shared directions in 'B' get too much emphasis, and the important, task-specific information carried by 'A' gets drowned out, leading to poorer results.

What's the solution?

The researchers developed a method called Pico, which stands for Pre-merge interference calibration in output-space. Pico doesn't require any new training data. Instead, before merging the adapters, it adjusts the 'B' matrix by reducing the influence of those overly-shared patterns and then rebalancing the overall update. This calibration step helps preserve the task-specific information and allows the merged adapter to perform better. Pico can be easily added to existing merging techniques.
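The paper's exact calibration procedure isn't reproduced here, but the overall idea – find the output directions the B matrices over-share, downscale them, merge, then rescale the merged update – can be sketched as follows. The SVD-based choice of shared directions, the top-k cutoff, the `alpha` downscaling factor, and the simple averaging merge are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in, r, n_tasks = 32, 24, 4, 3
Bs = [rng.normal(size=(d_out, r)) for _ in range(n_tasks)]
As = [rng.normal(size=(r, d_in)) for _ in range(n_tasks)]

# 1. Estimate shared output directions: SVD of the stacked B columns.
stacked = np.concatenate(Bs, axis=1)            # (d_out, n_tasks * r)
U, S, _ = np.linalg.svd(stacked, full_matrices=False)

# 2. Downscale the top-k (most shared) directions in each B.
#    k and alpha are illustrative knobs, not values from the paper.
k, alpha = 2, 0.5
P_shared = U[:, :k] @ U[:, :k].T                # projector onto shared directions
Bs_cal = [B - (1 - alpha) * (P_shared @ B) for B in Bs]

# 3. Merge (plain averaging here) and rescale the calibrated update so its
#    overall magnitude matches the uncalibrated merge.
merged_raw = sum(B @ A for B, A in zip(Bs, As)) / n_tasks
merged_cal = sum(B @ A for B, A in zip(Bs_cal, As)) / n_tasks
merged_cal *= np.linalg.norm(merged_raw) / np.linalg.norm(merged_cal)
print(merged_cal.shape)
```

Note that the whole procedure only touches the adapter weights – no training data is involved – which is what makes this style of calibration easy to bolt onto existing merging methods.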

Why it matters?

This work is important because it shows that merging LoRA adapters *can* be effective if you address how the different parts of the adapters interact. Pico significantly improves the accuracy of merged adapters across a variety of challenging tasks, including math, coding, and medical applications, and even allows them to outperform a single adapter trained on all the tasks combined. It demonstrates that treating the two matrices within a LoRA adapter as distinct components is key to successful merging.

Abstract

Merging separately trained LoRA adapters is a practical alternative to joint multi-task training, but it often hurts performance. Existing methods usually treat the LoRA update ΔW = BA as a single object and do not distinguish the two LoRA matrices. We show that the main source of LoRA merge interference comes from the output-side matrix B. Across tasks, B repeatedly uses a small set of shared directions, while A remains much more task-specific. As a result, the merged adapter overemphasizes these shared directions, and task-specific information is lost. We propose Pico (Pre-merge interference calibration in output-space), a data-free method that calibrates B before merge by downscaling over-shared directions and then rescaling the merged update. Pico plugs directly into existing merging methods such as Task Arithmetic, TIES, and TSV-M. Across eight different benchmarks from math, coding, finance, and medical domains, Pico improves average accuracy by 3.4-8.3 points over the corresponding base method and achieves the best overall average performance. Pico also enables merged adapters to outperform the LoRA trained with all task data. These results show that LoRA merging works better when the two LoRA matrices are treated separately.
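For context, the baseline the abstract refers to – Task Arithmetic applied to LoRA – merges the full updates ΔW_i = B_i A_i as single objects, without distinguishing the two factors. A minimal sketch (dimensions and the scaling coefficient `lam` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
d_out, d_in, r = 16, 12, 2
W0 = rng.normal(size=(d_out, d_in))             # pretrained weight

# Two task-specific LoRA adapters; each full update is DeltaW_i = B_i @ A_i.
B1, A1 = rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in))
B2, A2 = rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in))

# Task Arithmetic: add the scaled sum of updates to the base weight,
# treating each DeltaW as a single object (the practice the paper questions).
lam = 0.5
W_merged = W0 + lam * (B1 @ A1 + B2 @ A2)
print(W_merged.shape)
```

Pico's contribution is a calibration step applied to the B factors before this kind of merge, rather than a replacement for the merge rule itself.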