The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

Yan Wang, Yitao Xu, Nanhan Shen, Jinyan Su, Jimin Huang, Zining Zhu

2026-01-09

Summary

This research investigates how Mixture-of-Experts (MoE) models, which are designed to specialize in different areas, actually work internally. It challenges the common assumption that these models achieve specialization by having different experts handle different types of problems.

What's the problem?

Mixture-of-Experts models are built on the idea that they become good at specific tasks by routing different inputs to different 'expert' parts of the network. However, it was not clear whether this specialization was truly happening, or whether there was a hidden pattern in how the model used its experts: do experts really specialize, or are a few experts consistently relied on across many different tasks?
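To make the routing idea concrete, here is a minimal, illustrative PyTorch sketch of a top-k MoE layer. The class name, the layer sizes, and the two-experts-per-token budget are arbitrary choices for this example, not details of any model studied in the paper.

```python
# Illustrative top-k MoE routing: each token is scored by a router and
# dispatched to its top-k experts. Shapes and names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)                # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalize over chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e          # tokens sent to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * self.experts[e](x[mask])
        return out, idx                        # idx exposes the routing decisions

layer = TopKMoELayer()
tokens = torch.randn(16, 64)
y, routed_to = layer(tokens)
print(routed_to[:4])  # which experts the first four tokens were sent to
```

The question the paper asks is about the `routed_to` decisions: whether they genuinely shift from expert to expert as the task domain changes, or keep landing on the same small group.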

What's the solution?

The researchers developed a framework called COMMITTEEAUDIT to analyze which experts are used for different tasks. Instead of looking at individual experts, they focused on groups of experts. They found a 'Standing Committee': a small, consistent group of experts that captures the majority of the routing mass across all domains, layers, and even different routing budgets (how many experts are activated per token). They also found that these core experts anchor the basic reasoning and structure of problems, while other, less frequently used experts handle domain-specific details.
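As a rough illustration of auditing routing at the group level rather than per expert, here is a hedged NumPy sketch. It is not the authors' COMMITTEEAUDIT implementation; the function names, the committee-selection heuristic, and the toy data are assumptions made for this example. The idea is to pick one small fixed group of experts and measure what fraction of the total routing mass it captures in every domain.

```python
# Sketch of a group-level routing audit, assuming we have logged per-token
# routing weights for each domain. Hypothetical code, not COMMITTEEAUDIT.
import numpy as np

def routing_mass_per_expert(weights):
    """weights: (tokens, n_experts) routing probabilities for one domain."""
    return weights.sum(axis=0) / weights.sum()

def standing_committee(domain_weights, committee_size=4):
    """Pick the fixed expert group with the highest routing mass averaged
    across domains, then report that group's share in each domain."""
    masses = np.stack([routing_mass_per_expert(w) for w in domain_weights])
    mean_mass = masses.mean(axis=0)                  # average over domains
    committee = np.argsort(mean_mass)[::-1][:committee_size]
    per_domain_share = masses[:, committee].sum(axis=1)
    return committee, per_domain_share

# Toy data: 3 "domains", 1000 tokens each, 8 experts, with a built-in bias
# toward experts 0 and 1 so the committee effect is visible.
rng = np.random.default_rng(0)
domains = [rng.dirichlet(np.array([8, 8, 1, 1, 1, 1, 1, 1]), size=1000)
           for _ in range(3)]

committee, share = standing_committee(domains, committee_size=2)
print("committee experts:", committee)           # e.g. [0 1]
print("routing-mass share per domain:", share)   # high in every domain
```

If specialization were real, the committee's share would collapse in domains it does not "own"; the paper's finding is that, in real MoE models, a compact group keeps a majority share everywhere.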

Why it matters?

This finding is important because it suggests that Mixture-of-Experts models might not be as specialized as commonly believed. The model has a strong structural bias toward relying on a central group of experts, which means current training objectives, such as load-balancing losses that force all experts to be used equally, may be working against the model's natural optimization path and limiting training efficiency and performance. Understanding this bias could lead to better training techniques and more capable models.
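The "equal usage" pressure mentioned above is typically implemented as a load-balancing auxiliary loss. A common formulation (in the style popularized by Switch Transformer) multiplies, per expert, the fraction of tokens routed to it by its mean router probability; the sketch below shows that standard form, not a loss taken from this paper.

```python
# Standard load-balancing auxiliary loss sketch: pushes both the token
# fraction per expert (f_i) and the mean router probability (P_i) toward
# uniform. The paper argues this kind of uniformity pressure may fight the
# model's natural pull toward a Standing Committee.
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits, top1_idx):
    """router_logits: (tokens, n_experts); top1_idx: (tokens,) chosen expert."""
    n_experts = router_logits.shape[-1]
    probs = F.softmax(router_logits, dim=-1)
    # f_i: fraction of tokens whose top-1 expert is i.
    f = torch.bincount(top1_idx, minlength=n_experts).float() / top1_idx.numel()
    # P_i: mean router probability assigned to expert i.
    P = probs.mean(dim=0)
    # Minimized (value 1.0) when both f and P are uniform over experts.
    return n_experts * torch.sum(f * P)

logits = torch.randn(1024, 8)
loss = load_balancing_loss(logits, logits.argmax(dim=-1))
print(loss)  # ~1.0 when routing is roughly balanced, larger when skewed
```

The paper's claim, restated in these terms: if the model's natural optimum concentrates routing mass on a committee, penalizing any deviation from uniform f and P pulls training away from that optimum.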

Abstract

Mixture of Experts models are widely assumed to achieve domain specialization through sparse routing. In this work, we question this assumption by introducing COMMITTEEAUDIT, a post hoc framework that analyzes routing behavior at the level of expert groups rather than individual experts. Across three representative models and the MMLU benchmark, we uncover a domain-invariant Standing Committee. This is a compact coalition of routed experts that consistently captures the majority of routing mass across domains, layers, and routing budgets, even when architectures already include shared experts. Qualitative analysis further shows that Standing Committees anchor reasoning structure and syntax, while peripheral experts handle domain-specific knowledge. These findings reveal a strong structural bias toward centralized computation, suggesting that specialization in Mixture of Experts models is far less pervasive than commonly believed. This inherent bias also indicates that current training objectives, such as load-balancing losses that enforce uniform expert utilization, may be working against the model's natural optimization path, thereby limiting training efficiency and performance.