The backbone of the model is a frame-causal rectified flow transformer, trained from scratch with a focus on interactive experiences. This design allows for seamless, low-latency interaction: users can move the camera freely with the mouse and press any key on the keyboard. Each frame is generated with the user's controls as context, so the world responds to input frame by frame, which makes the experience realistic and immersive.
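To make the idea of frame-causal generation with controls as context more concrete, here is a minimal toy sketch. It is not the Waypoint-1 architecture or the WorldEngine API: `Controls`, `toy_velocity`, `generate_frame`, and `rollout` are hypothetical names, the "frame" is a short list of floats, and the transformer is replaced by a hand-written velocity field. It only illustrates the general pattern: each frame starts from noise, is integrated with a few Euler steps of a rectified flow, and depends only on earlier frames and the current controls.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Controls:
    """Hypothetical per-frame user input: mouse delta plus pressed keys."""
    mouse_dx: float = 0.0
    mouse_dy: float = 0.0
    keys: frozenset = frozenset()


def toy_velocity(x, t, past_frames, controls):
    """Stand-in for the transformer (an assumption, not the real model).

    The target frame is derived only from past frames and the current
    controls, which is what makes the rollout frame-causal. The returned
    velocity points along the straight line from x to that target.
    """
    prev = past_frames[-1] if past_frames else [0.0] * len(x)
    shift = controls.mouse_dx + (1.0 if "w" in controls.keys else 0.0)
    target = [p + shift for p in prev]
    return [(g - xi) / (1.0 - t) for g, xi in zip(target, x)]


def generate_frame(past_frames, controls, num_steps, rng, frame_size=4):
    """Integrate from noise (t=0) to a frame (t=1) with Euler steps."""
    x = [rng.gauss(0.0, 1.0) for _ in range(frame_size)]
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = toy_velocity(x, t, past_frames, controls)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def rollout(control_stream, num_steps, seed=0):
    """Generate one frame per control input, feeding frames back as context."""
    rng = random.Random(seed)
    frames = []
    for controls in control_stream:
        frames.append(generate_frame(frames, controls, num_steps, rng))
    return frames


stream = [Controls(mouse_dx=1.0), Controls(keys=frozenset({"w"}))]
frames = rollout(stream, num_steps=4)
```

In this toy setup the flow is exactly straight, so the number of Euler steps does not change the result. A learned model is only approximately straight, which is why fewer steps typically trade some quality for speed.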
Waypoint-1 is part of the Overworld platform, which includes WorldEngine, an inference library for streaming interactive world models. WorldEngine provides the core tooling for building inference applications in pure Python and is optimized for low latency, high throughput, extensibility, and developer simplicity. The library reaches 30 FPS at 4 steps or 60 FPS at 2 steps, which makes it suitable for real-time interactive applications.
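The two quoted operating points can be compared with a little arithmetic. The sketch below assumes that the steps are performed once per frame and that frame time is dominated by those steps and splits evenly across them; both are simplifying assumptions, and the helper names are illustrative rather than part of WorldEngine.

```python
def steps_per_second(fps, steps_per_frame):
    """Total steps the system must complete each second."""
    return fps * steps_per_frame


def step_budget_ms(fps, steps_per_frame):
    """Time available per step, assuming steps fill the whole frame time."""
    frame_ms = 1000.0 / fps
    return frame_ms / steps_per_frame


for fps, steps in [(30, 4), (60, 2)]:
    rate = steps_per_second(fps, steps)
    budget = step_budget_ms(fps, steps)
    print(f"{fps} FPS at {steps} steps: {rate} steps/s, {budget:.2f} ms per step")
```

Under these assumptions both configurations require the same 120 steps per second, so halving the step count is what allows the frame rate to double.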


