InCoder-32B-Thinking: Industrial Code World Model for Thinking

Jian Yang, Wei Zhang, Jiajun Wu, Junhang Cheng, Tuney Zheng, Fanglin Xu, Weicheng Gu, Lin Jing, Yaxin Du, Joseph Li, Yizhi Li, Yan Xing, Chuan Hao, Ran Tao, Ruihao Gong, Aishan Liu, Zhoujun Li, Mingjie Tang, Chenghua Lin, Siheng Chen, Wayne Xin Zhao, Xianglong Liu

2026-04-06

Summary

This research focuses on making AI better at writing code for complex hardware systems, like those found in computer chips and graphics cards. The goal is to have the AI not just *produce* code, but also *explain* its reasoning process, mimicking how a human engineer would approach the problem.

What's the problem?

Currently, when AI tries to write code for hardware, it often lacks a clear understanding of how the code interacts with the underlying hardware. It doesn't show its work, so to speak. This makes it hard to trust the AI's solutions, especially when dealing with timing issues and hardware limitations. Engineers need to understand *why* a piece of code works (or doesn't work) on specific hardware, and existing AI tools don't provide that insight.

What's the solution?

The researchers developed a new AI model called InCoder-32B-Thinking. They trained it using a method called Error-driven Chain-of-Thought (ECoT), which simulates a multi-turn conversation in which the AI makes a mistake, receives error feedback from the environment, and then corrects itself. This forces the AI to spell out its reasoning and explicitly model the error-correction process. They also gave the AI an industrial code world model (ICWM) that predicts how code will behave on hardware *before* actually running it, letting the AI self-check its work. All synthesized reasoning traces were validated with actual hardware toolchains before being used as training data.

Why it matters?

This work is important because it moves AI closer to being a truly helpful partner for hardware engineers. By generating reasoning traces, the AI can help engineers understand and debug complex hardware code more efficiently. The improved performance on benchmarks suggests this approach could lead to faster development cycles and more reliable hardware systems, especially in areas like chip design and GPU optimization.

Abstract

Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reasoning traces showing how engineers reason about hardware constraints and timing semantics. In this work, we propose InCoder-32B-Thinking, trained on data from the Error-driven Chain-of-Thought (ECoT) synthesis framework with an industrial code world model (ICWM) to generate reasoning traces. Specifically, ECoT generates reasoning chains by synthesizing the thinking content from multi-turn dialogue with environmental error feedback, explicitly modeling the error-correction process. ICWM is trained on domain-specific execution traces from Verilog simulation, GPU profiling, and other sources; it learns the causal dynamics of how code affects hardware behavior and enables self-verification by predicting execution outcomes before actual compilation. All synthesized reasoning traces are validated through domain toolchains, creating training data that matches the natural reasoning-depth distribution of industrial tasks. Evaluation on 14 general benchmarks (81.3% on LiveCodeBench v5) and 9 industrial benchmarks (84.0% on CAD-Coder and 38.0% on KernelBench) shows that InCoder-32B-Thinking achieves top-tier open-source results across all domains.
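The self-verification step described in the abstract can be sketched as a cheap predicted-outcome filter placed in front of the expensive real toolchain. The following is an illustrative toy only: `predict_outcome`, `compile_and_run`, and the surface-feature "prediction" are hypothetical stand-ins for a learned world model and a real simulator:

```python
# Illustrative sketch of world-model self-verification (ICWM-style):
# predict the execution outcome of candidate code before invoking the
# real toolchain, and only compile candidates predicted to pass.
# All names here are hypothetical placeholders, not the paper's API.

def predict_outcome(code: str) -> str:
    """Toy 'world model': predicts 'pass' or 'timing_violation' from a
    surface feature (a real ICWM would be a model trained on execution
    traces from simulation and profiling)."""
    return "timing_violation" if "#0" in code else "pass"

def compile_and_run(code: str) -> str:
    """Stand-in for the expensive real toolchain (simulator/profiler)."""
    return "pass"

def self_verify(candidates: list[str]) -> list[str]:
    """Filter candidates with the cheap world model first; only the
    survivors reach actual compilation."""
    verified = []
    for code in candidates:
        if predict_outcome(code) != "pass":
            continue  # predicted failure: skip the costly real run
        if compile_and_run(code) == "pass":
            verified.append(code)
    return verified

candidates = ["assign y = a & b;", "always @(posedge clk) #0 q <= d;"]
print(self_verify(candidates))  # only the first candidate survives
```

The design point is that the world model makes verification cheap enough to run on every candidate, so errors are caught before compilation rather than after.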