
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing

Zhongyang Li, Ziyue Li, Tianyi Zhou

2025-04-11

Summary

This paper introduces C3PO, a method that lets AI language models pick the best 'experts' (specialized mini-models) for each task on the fly, like choosing the right tools for a job instead of sticking to a fixed set.

What's the problem?

Current AI models built from multiple experts often route inputs to suboptimal expert combinations, causing mistakes and wasted computation, especially on new or tricky tasks. The authors find this routing gap costs 10-20% in accuracy.

What's the solution?

C3PO re-optimizes the expert mixing weights at test time by borrowing from similar reference tasks the model already solved correctly, and it only adjusts the core experts in the most critical layers, saving time and energy while boosting accuracy.

Why it matters?

This makes AI models smarter and faster for real-world uses like answering complex questions or coding, letting smaller models compete with bigger ones while using less power.

Abstract

Mixture-of-Experts (MoE) Large Language Models (LLMs) suffer from severely sub-optimal expert pathways: our study reveals that naive expert selection learned from pretraining leaves a surprising 10-20% accuracy gap for improvement. Motivated by this observation, we develop a novel class of test-time optimization methods to re-weight or "re-mix" the experts in different layers jointly for each test sample. Since the test sample's ground truth is unknown, we propose to optimize a surrogate objective defined by the sample's "successful neighbors" from a reference set of samples. We introduce three surrogates and algorithms based on mode-finding, kernel regression, and the average loss of similar reference samples/tasks. To reduce the cost of optimizing whole pathways, we apply our algorithms merely to the core experts' mixing weights in critical layers, which enjoy similar performance but save significant computation. This leads to "Critical-Layer, Core-Expert, Collaborative Pathway Optimization (C3PO)". We apply C3PO to two recent MoE LLMs and examine it on six widely-used benchmarks. It consistently improves the base model by 7-15% in accuracy and outperforms widely used test-time learning baselines, e.g., in-context learning and prompt/prefix tuning, by a large margin. Moreover, C3PO enables MoE LLMs with 1-3B active parameters to outperform LLMs of 7-9B parameters, hence improving MoE's advantages on efficiency. Our thorough ablation study further sheds novel insights on achieving test-time improvement on MoE.
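To make the "successful neighbors" idea concrete, here is a minimal sketch of a kernel-regression-style surrogate: for a test sample, find its most similar reference samples that the model answered correctly, and set the core-expert mixing weights in the critical layers to a similarity-weighted average of those neighbors' weights. All function names, array shapes, and the embedding-based similarity are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def remix_core_experts(sample_emb, ref_embs, ref_weights, ref_success, k=3):
    """Re-mix core-expert weights for one test sample.

    sample_emb  : (d,)      embedding of the test sample
    ref_embs    : (n, d)    embeddings of the reference samples
    ref_weights : (n, L, E) expert mixing weights over E core experts
                            at the L critical layers, per reference sample
    ref_success : (n,) bool True where the reference pathway succeeded
    Returns a (L, E) matrix of re-mixed weights, each row summing to 1.
    """
    # Keep only the "successful neighbors" as candidates.
    embs = ref_embs[ref_success]
    weights = ref_weights[ref_success]

    # Cosine similarity between the test sample and each candidate.
    sims = embs @ sample_emb / (
        np.linalg.norm(embs, axis=1) * np.linalg.norm(sample_emb) + 1e-8)

    # Pick the k nearest successful neighbors.
    top = np.argsort(sims)[-k:]

    # Kernel-regression-style estimate: similarity-weighted average of the
    # neighbors' mixing weights, then renormalize per critical layer.
    kernel = np.exp(sims[top])
    mixed = (kernel[:, None, None] * weights[top]).sum(axis=0) / kernel.sum()
    return mixed / mixed.sum(axis=-1, keepdims=True)
```

In this sketch only the (L, E) block of weights for the critical layers is touched; the routing in all other layers stays as pretrained, which is what keeps the test-time optimization cheap.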