
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Xingwei Qu, Shaowen Wang, Zihao Huang, Kai Hua, Fan Yin, Rui-Jie Zhu, Jundong Zhou, Qiyang Min, Zihao Wang, Yizhi Li, Tianyu Zhang, He Xing, Zheng Zhang, Yuxuan Song, Tianyu Zheng, Zhiyuan Zeng, Chenghua Lin, Ge Zhang, Wenhao Huang

2026-01-02


Summary

This paper introduces Dynamic Large Concept Models (DLCM), a new way to build large language models that aims to use computation more efficiently and to outperform existing models at the same inference cost.

What's the problem?

Current large language models spend the same amount of computation on every token (every word or piece of a word), even though information is spread very unevenly through text. Some spans are highly predictable and need little processing, while others mark important semantic shifts that deserve more attention. Existing models therefore waste capacity on the easy parts and under-allocate it to the parts that matter most, which limits their overall performance.

What's the solution?

DLCM tackles this by learning to identify 'concepts' in the text, essentially grouping related tokens into variable-length chunks. Instead of processing every individual word, the model compresses spans of text into these concepts and then spends most of its processing power reasoning about the relationships *between* concepts. This is like summarizing a paragraph before analyzing it: the overall task becomes more manageable. The authors also develop a compression-aware scaling law (a recipe for splitting a fixed compute budget between token-level and concept-level processing) and a decoupled μP parametrization that keeps training stable and lets tuned hyperparameters transfer across model widths and compression levels.
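To make the compression step concrete, here is a minimal sketch of how token-level representations could be pooled into variable-length concept representations once boundary positions are known. The function name, shapes, and mean-pooling choice are assumptions for illustration only, not the paper's actual implementation; in DLCM the boundaries themselves are learned end-to-end rather than given.

```python
import torch

def compress_tokens_to_concepts(token_states, boundary_mask):
    """Illustrative sketch (not the paper's code): mean-pool contiguous
    spans of token hidden states into 'concept' vectors.

    token_states:  (seq_len, hidden) float tensor of token-level states.
    boundary_mask: (seq_len,) bool tensor; True marks the last token of a concept.
    Returns:       (num_concepts, hidden) tensor of concept representations.
    """
    # Give every token a segment id: 0,0,...,1,1,...,2,... per concept span.
    seg_ids = torch.cumsum(boundary_mask.long(), dim=0)
    seg_ids = torch.cat([torch.zeros(1, dtype=torch.long), seg_ids[:-1]])

    num_concepts = int(seg_ids.max().item()) + 1
    hidden = token_states.size(-1)

    # Sum the tokens of each span, then divide by span length (mean pooling).
    sums = torch.zeros(num_concepts, hidden).index_add_(0, seg_ids, token_states)
    counts = torch.zeros(num_concepts).index_add_(
        0, seg_ids, torch.ones(seg_ids.numel())
    )
    return sums / counts.unsqueeze(-1)

# Usage: 6 tokens grouped into concepts of length 2, 3, and 1 (an average of 2 tokens per concept).
states = torch.randn(6, 8)
boundaries = torch.tensor([False, True, False, False, True, True])
print(compress_tokens_to_concepts(states, boundaries).shape)  # torch.Size([3, 8])
```

The heavier concept-level backbone would then run only on these pooled vectors, which is what allows compute to shift from the token level to the concept level.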

Why does it matter?

This research is important because it suggests a new path for scaling up language models without simply increasing their size and computational cost. By focusing on understanding the core ideas within text, rather than every single word, DLCM achieves better performance with the same amount of processing power. This could lead to more powerful and efficient AI systems in the future, making them more accessible and sustainable.

Abstract

Large Language Models (LLMs) apply uniform computation to all tokens, despite language exhibiting highly non-uniform information density. This token-uniform regime wastes capacity on locally predictable spans while under-allocating computation to semantically critical transitions. We propose Dynamic Large Concept Models (DLCM), a hierarchical language modeling framework that learns semantic boundaries from latent representations and shifts computation from tokens to a compressed concept space where reasoning is more efficient. DLCM discovers variable-length concepts end-to-end without relying on predefined linguistic units. Hierarchical compression fundamentally changes scaling behavior. We introduce the first compression-aware scaling law, which disentangles token-level capacity, concept-level reasoning capacity, and compression ratio, enabling principled compute allocation under fixed FLOPs. To stably train this heterogeneous architecture, we further develop a decoupled μP parametrization that supports zero-shot hyperparameter transfer across widths and compression regimes. At a practical setting (R=4, corresponding to an average of four tokens per concept), DLCM reallocates roughly one-third of inference compute into a higher-capacity reasoning backbone, achieving a +2.69% average improvement across 12 zero-shot benchmarks under matched inference FLOPs.
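As a rough back-of-the-envelope illustration of why such a reallocation is possible, the sketch below approximates forward-pass cost as 2 × parameters × positions and compares a token-uniform model with a DLCM-style split at R = 4. The parameter counts and sequence length are invented purely so the two budgets match; they are not the paper's configuration.

```python
# Hypothetical FLOPs accounting: all numbers are illustrative, not the paper's.
def forward_flops(params, positions):
    # Common dense-transformer approximation: ~2 FLOPs per parameter per position.
    return 2 * params * positions

seq_len, R = 4096, 4                                 # R = 4 -> ~4 tokens per concept

baseline = forward_flops(params=1.0e9, positions=seq_len)

# DLCM-style split: lightweight token-level layers over all positions, plus a
# larger reasoning backbone that only sees the compressed concept sequence.
token_side = forward_flops(params=0.35e9, positions=seq_len)
backbone   = forward_flops(params=2.60e9, positions=seq_len // R)

print(f"token-uniform model: {baseline:.3e} FLOPs")
print(f"DLCM-style split:    {token_side + backbone:.3e} FLOPs")
# The budgets match (0.35 + 2.60 / 4 = 1.0), yet the concept-level backbone
# carries ~2.6x the parameters of the baseline model.
```

How a fixed budget should actually be split between token-level capacity, concept-level capacity, and compression ratio is what the compression-aware scaling law is intended to predict.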