Key Features

Real-time interactive video diffusion model
Controllable and prompted via text, mouse, and keyboard
Trained on 10,000 hours of diverse video game footage
Frame-causal rectified flow transformer architecture
Seamless, low-latency interaction
WorldEngine inference library for interactive world model streaming
Optimized for low latency and high throughput: 30 FPS at 4 steps or 60 FPS at 2 steps
Suitable for real-time interactive applications

The backbone of the model is a frame-causal rectified flow transformer, trained from the ground up with a focus on interactive experiences. This allows for seamless, low-latency interaction: users can move the camera freely with the mouse and press any key on the keyboard. Each frame is generated with the user's controls as context, so the output responds to input frame by frame and the experience feels realistic and immersive.
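
To make the idea of frame-by-frame generation with controls as context more concrete, here is a minimal toy sketch in Python. It is not Waypoint-1's code or the WorldEngine API: `Controls`, `toy_velocity`, `generate_frame`, and `rollout` are hypothetical names, the "frame" is a short list of floats, and the transformer is replaced by a hand-written velocity function. It only illustrates the loop shape: start each frame from noise, take a few Euler steps along a straight-line (rectified-flow-style) velocity, and condition only on past frames and the current input.

```python
import random
from dataclasses import dataclass


@dataclass
class Controls:
    """Hypothetical per-frame user input: mouse delta plus pressed keys."""
    mouse_dx: float = 0.0
    keys: frozenset = frozenset()


def toy_velocity(x, t, past_frames, controls):
    """Stand-in for the transformer (an assumption, not the real model).

    The target depends only on past frames and the current controls,
    which is what makes the loop frame-causal.
    """
    prev = past_frames[-1] if past_frames else [0.0] * len(x)
    shift = controls.mouse_dx + 0.1 * len(controls.keys)
    target = [p + shift for p in prev]
    # Straight-line velocity that carries x to the target by t = 1.
    return [(tg - xi) / (1.0 - t) for tg, xi in zip(target, x)]


def generate_frame(past_frames, controls, num_steps=4, size=4, rng=None):
    """Sample one frame: start from noise, then take num_steps Euler steps."""
    if rng is None:
        rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(size)]
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t stays below 1, so (1 - t) is never zero
        v = toy_velocity(x, t, past_frames, controls)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def rollout(control_stream, num_steps=4):
    """Generate frames one at a time; each sees only earlier frames."""
    frames = []
    for controls in control_stream:
        frames.append(generate_frame(frames, controls, num_steps))
    return frames


if __name__ == "__main__":
    stream = [Controls(mouse_dx=1.0), Controls(mouse_dx=1.0, keys=frozenset({"w"}))]
    for frame in rollout(stream):
        print([round(v, 3) for v in frame])
```

Because each frame is produced as soon as its controls arrive, lowering `num_steps` shortens the work per frame, which is the trade-off behind the step counts quoted for WorldEngine.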


Waypoint-1 is part of the Overworld platform, which includes the WorldEngine inference library for interactive world model streaming. WorldEngine provides the core tooling for building inference applications in pure Python, optimized for low latency, high throughput, extensibility, and developer simplicity. The library sustains high performance, achieving 30 FPS at 4 steps or 60 FPS at 2 steps, making it suitable for real-time interactive applications.
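
The two operating points above can be compared with a little arithmetic. The sketch below assumes that "steps" means sampling steps per generated frame, which the text does not state explicitly, and it ignores any per-frame overhead outside the sampling loop. The helper names are illustrative only.

```python
def frame_budget_ms(fps):
    """Wall-clock time available per frame at a given frame rate."""
    return 1000.0 / fps


def step_budget_ms(fps, steps_per_frame):
    """Upper bound on time per sampling step, ignoring other overhead."""
    return frame_budget_ms(fps) / steps_per_frame


def steps_per_second(fps, steps_per_frame):
    """Total sampling steps the model must run each second."""
    return fps * steps_per_frame


if __name__ == "__main__":
    for fps, steps in [(30, 4), (60, 2)]:
        print(
            f"{fps} FPS at {steps} steps: "
            f"{frame_budget_ms(fps):.2f} ms per frame, "
            f"{step_budget_ms(fps, steps):.2f} ms per step, "
            f"{steps_per_second(fps, steps)} steps per second"
        )
```

Under that assumption both settings require the same 120 sampling steps per second, so the choice between them trades steps per frame against frame rate rather than changing the total compute per second.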
