Qwen3-Omni supports 119 text languages, 19 languages for speech input, and 10 for speech output, making it well suited to multilingual communication. It is built on an MoE-based Thinker–Talker architecture: AuT pretraining equips it with strong general representations, while a multi-codebook design keeps inference latency low. The model achieves top results on numerous audio and video benchmarks, rivaling leading closed-source systems, and its real-time audio and video interaction supports low-latency, natural turn-taking in conversation.
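To make the latency benefit of the multi-codebook design concrete, the sketch below is a purely conceptual illustration, not the actual Qwen3-Omni implementation: a stand-in Talker emits one multi-codebook codec frame per step, and a stand-in streaming decoder turns each frame into audio immediately, so the first audio chunk is ready after a single frame instead of after the full utterance. All names, sizes, and the frame format are hypothetical.

```python
# Conceptual sketch only (hypothetical names and sizes, not Qwen3-Omni's code):
# per-frame multi-codebook decoding lets audio stream out as tokens arrive.
from typing import Iterator
import numpy as np

FRAME_SAMPLES = 1920  # hypothetical number of audio samples per codec frame


def talker_frames(num_frames: int, num_codebooks: int = 4) -> Iterator[np.ndarray]:
    """Stand-in for the Talker: yields one multi-codebook frame at a time.

    Each frame is a vector of `num_codebooks` discrete codec IDs; in a real
    system these would come from autoregressive decoding.
    """
    rng = np.random.default_rng(0)
    for _ in range(num_frames):
        yield rng.integers(0, 1024, size=num_codebooks)


def decode_frame(frame: np.ndarray) -> np.ndarray:
    """Stand-in for a streaming codec decoder: one frame -> one audio chunk."""
    return np.zeros(FRAME_SAMPLES, dtype=np.float32)  # silence as a placeholder


# Streaming playback: the chunk for frame t is available as soon as frame t is
# decoded, so first sound arrives after one frame rather than the whole reply.
for frame in talker_frames(num_frames=50):
    chunk = decode_frame(frame)
    # play(chunk)  # hand each chunk to the audio device as it is produced
```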
Qwen3-Omni offers flexible control through system prompts, so its behavior can be tailored to specific users and applications. It also ships a detailed audio captioner that produces precise, low-hallucination descriptions of arbitrary audio, filling a gap in open-source multimodal tooling. The model family includes variants specialized for instruction following, explicit thinking and reasoning, and downstream fine-tuned audio captioning. Deployment options include Hugging Face Transformers, vLLM inference, Docker images, and a web UI demo, so its multimodal capabilities can be explored locally or via APIs, as sketched below.
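As one concrete path, the snippet below sketches how a system prompt could steer a vLLM-served Qwen3-Omni checkpoint through the OpenAI-compatible API. The server URL, port, and model ID are assumptions; substitute whatever your deployment actually uses.

```python
# Minimal sketch: customizing behavior with a system prompt against a
# vLLM OpenAI-compatible endpoint. URL and model ID are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumption: default local vLLM serve address
    api_key="EMPTY",                      # vLLM does not check the key by default
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-Omni-30B-A3B-Instruct",  # assumption: adjust to the checkpoint you serve
    messages=[
        # The system prompt steers persona, language, and response style.
        {"role": "system", "content": "You are a concise bilingual assistant. Answer in English."},
        {"role": "user", "content": "Summarize what a Thinker-Talker architecture does."},
    ],
)
print(response.choices[0].message.content)
```

Because this goes through the standard chat-completions route, the same system-prompt pattern applies whether the model is served locally, inside a Docker container, or behind a remote API.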