Key Features

480B-parameter Mixture-of-Experts model with 35B active parameters
Natively supports a context length of 256K tokens, extendable to 1M tokens with extrapolation methods such as YaRN
Achieves state-of-the-art results among open models on Agentic Coding, Agentic Browser-Use, and Agentic Tool-Use
Works seamlessly with the community’s best developer tools
Pre-trained on 7.5T tokens with a 70% code ratio
Can be used with Qwen Code and Claude Code
Accessible through the Alibaba Cloud Model Studio API

Qwen3-Coder has been trained using a combination of pre-training and post-training methods. During pre-training, the model was scaled along multiple dimensions to strengthen its core capabilities: tokens, context length, and synthetic data. It was trained on 7.5T tokens with a 70% code ratio, excelling at coding while preserving general and math abilities. The model natively supports a 256K-token context, which can be extended up to 1M tokens with YaRN, making it well suited to repository-scale and dynamic data.
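As a rough illustration, extending the 256K native window to 1M tokens corresponds to a 4x positional-interpolation factor. The sketch below computes that factor and assembles a YaRN-style `rope_scaling` entry; the exact config keys and values are assumptions here, so check the serving stack's documentation (e.g. Transformers or vLLM) for the form it expects.

```python
# Illustrative only: the key names below mirror common rope_scaling
# configs but are not taken from the Qwen3-Coder release itself.
NATIVE_CONTEXT = 262_144      # 256K tokens, supported natively
TARGET_CONTEXT = 1_048_576    # 1M tokens via extrapolation

factor = TARGET_CONTEXT / NATIVE_CONTEXT  # 4.0

rope_scaling = {
    "rope_type": "yarn",  # hypothetical key; depends on the framework
    "factor": factor,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}
print(rope_scaling)
```

The point of the calculation is simply that YaRN stretches the model's rotary position embeddings by a fixed factor rather than retraining them, which is why the extended window is described as extrapolation rather than native support.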


Qwen3-Coder works with a range of developer tools, including Qwen Code, a research-purpose CLI adapted from Gemini CLI and enhanced with customized prompts and function-calling protocols to fully unleash the model's capabilities on agentic coding tasks. It can also be used with Claude Code, Anthropic's popular coding tool. In addition, the model is accessible through the Alibaba Cloud Model Studio API, letting developers integrate it into their own applications.
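Model Studio exposes an OpenAI-compatible chat-completions interface, so a request is an ordinary messages payload. The sketch below builds such a payload; the model identifier and endpoint URL in the comments are assumptions, so confirm the exact values for your account and region in the Model Studio documentation.

```python
import json

# Hypothetical model name -- verify against the Model Studio catalog.
payload = {
    "model": "qwen3-coder-plus",
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a function that reverses a linked list."},
    ],
}

# With the `openai` client installed, the call would look roughly like:
#   from openai import OpenAI
#   client = OpenAI(api_key="...",          # your Model Studio API key
#                   base_url="https://.../compatible-mode/v1")  # region-specific
#   resp = client.chat.completions.create(**payload)
print(json.dumps(payload, indent=2))
```

Because the interface is OpenAI-compatible, existing tooling built on the `openai` SDK can typically be pointed at the Model Studio endpoint by changing only the base URL, API key, and model name.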
