SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens
Yinhan He, Wendy Zheng, Yaochen Zhu, Zaiyi Zheng, Lin Su, Sriram Vasudevan, Qi Guo, Liangjie Hong, Jundong Li
2025-11-03
Summary
This paper introduces a new method, SemCoT, to make complex reasoning tasks for large language models (LLMs) faster and more efficient without sacrificing accuracy.
What's the problem?
Large language models are getting good at 'thinking through' problems step by step, a technique called Chain-of-Thought reasoning. However, this process can be slow because it generates a lot of text. Newer methods try to hide these reasoning steps *inside* the model itself, making it faster, but they often lose the meaning of the reasoning or ignore how long it takes the model to produce each hidden step.
What's the solution?
The researchers developed SemCoT, which tackles these issues in two ways. First, they train a sentence transformer to check whether the hidden reasoning, if translated back to natural language, still carries the same meaning as a correct step-by-step explanation. This ensures the model isn't just fast but also faithful to the original reasoning. Second, they fine-tune a smaller, faster language model to generate the hidden reasoning steps efficiently, using knowledge distillation guided by that same meaning-checker. In short, they streamline the reasoning process while keeping it logically sound.
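The "meaning check" described above can be sketched as a contrastive objective: pull the implicit reasoning's embedding toward the embedding of the matching ground-truth explanation, and push it away from unrelated ones. The following is a minimal, illustrative sketch (not the paper's actual implementation); the function name, the toy 2-D embeddings, and the temperature value are all assumptions for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_alignment_loss(implicit_emb, positive_emb, negative_embs, tau=0.1):
    """InfoNCE-style contrastive loss (illustrative stand-in for the
    paper's sentence-transformer training objective): low when the
    implicit-reasoning embedding is close to its matching ground-truth
    explanation, high when it is closer to unrelated explanations."""
    pos = math.exp(cosine(implicit_emb, positive_emb) / tau)
    negs = sum(math.exp(cosine(implicit_emb, n) / tau) for n in negative_embs)
    return -math.log(pos / (pos + negs))
```

With a toy example, an implicit embedding that points in nearly the same direction as its ground-truth explanation yields a much smaller loss than one that points toward an unrelated explanation, which is exactly the signal used to enforce semantic preservation during training.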
Why it matters?
This work is important because it makes advanced reasoning capabilities of LLMs more practical for real-world applications where speed and efficiency are crucial. By improving both the speed of generating reasoning and ensuring that reasoning remains accurate, SemCoT represents a significant step towards deploying these powerful models in more situations.
Abstract
The verbosity of Chain-of-Thought (CoT) reasoning hinders its mass deployment in efficiency-critical applications. Recently, implicit CoT approaches have emerged that encode reasoning steps within an LLM's hidden embeddings (termed "implicit reasoning") rather than explicit tokens. This approach accelerates CoT by reducing the reasoning length and bypassing some LLM components. However, existing implicit CoT methods face two significant challenges: (1) they fail to preserve the semantic alignment between the implicit reasoning (when transformed to natural language) and the ground-truth reasoning, resulting in significant CoT performance degradation; and (2) they focus on reducing the length of the implicit reasoning but neglect the considerable time cost for an LLM to generate each individual implicit reasoning token. To tackle these challenges, we propose a novel semantically-aligned implicit CoT framework termed SemCoT. In particular, for the first challenge, we design a contrastively trained sentence transformer that evaluates semantic alignment between implicit and explicit reasoning, which is used to enforce semantic preservation during implicit reasoning optimization. To address the second challenge, we introduce an efficient implicit reasoning generator by finetuning a lightweight language model using knowledge distillation. This generator is guided by our sentence transformer to distill ground-truth reasoning into semantically aligned implicit reasoning, while also optimizing for accuracy. SemCoT is the first approach to enhance CoT efficiency by jointly optimizing token-level generation speed and preserving semantic alignment with ground-truth reasoning. Extensive experiments demonstrate the superior performance of SemCoT compared to state-of-the-art methods in both efficiency and effectiveness. Our code can be found at https://github.com/YinhanHe123/SemCoT/.
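The second component of the abstract, distilling a large teacher's reasoning into a lightweight generator, is commonly built on a KL-divergence term between softened teacher and student output distributions. The sketch below illustrates that standard knowledge-distillation loss in isolation; the function names, logit values, and temperature are assumptions for illustration, and the paper's full objective additionally includes the semantic-alignment and accuracy terms described above.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(l / temperature) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions,
    the standard knowledge-distillation term: zero when the lightweight
    student exactly matches the teacher, positive otherwise."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(t, s))
```

A higher temperature softens both distributions, exposing the teacher's relative preferences among non-top tokens, which is the extra signal distillation transfers beyond hard labels.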