Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Yuning Gong, Yifei Liu, Yifan Zhan, Muyao Niu, Xueying Li, Yuanjun Liao, Jiaming Chen, Yuanyuan Gao, Jiaqi Chen, Minming Chen, Li Zhou, Yuning Zhang, Wei Wang, Xiaoqing Hou, Huaxi Huang, Shixiang Tang, Le Ma, Dingwen Zhang, Xue Yang, Junchi Yan, Yanchi Zhang, Yinqiang Zheng

2025-12-10

Summary

This paper introduces Visionary, a platform for viewing and interacting with 3D scenes built with 3D Gaussian Splatting (3DGS) and related methods, directly in a web browser.

What's the problem?

Viewing these 3D models is currently difficult: existing tools are often complicated to set up, heavy to run, or poorly suited to dynamic and AI-generated content. These hurdles make the models hard to share and experiment with, and existing viewers are not optimized for speed or flexibility.

What's the solution?

The authors built Visionary, a platform that runs entirely in the web browser using WebGPU, a browser API for graphics and computation on the graphics card. Nothing needs to be installed, so a scene can be opened in a 'click-to-run' fashion. They also define a standard interface, the Gaussian Generator contract, through which different algorithms can generate or update the 3D model on every frame. A three.js plug-in with a TypeScript API makes it straightforward to add Visionary to existing websites. Finally, Visionary sorts the 3D primitives on the graphics card, which the authors credit for its faster rendering compared with other web viewers.
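To make the sorting step concrete, here is a minimal CPU-side sketch of what "sorting the 3D elements" means: ordering splats by their depth along the camera's viewing direction so that semi-transparent splats can be blended in a consistent order. This is an illustration only. Visionary does this on the GPU, and the names below (`viewDepth`, `sortBackToFront`) and the back-to-front order are assumptions made for the example, not taken from the paper or its API.

```typescript
// Illustrative only: a CPU-side depth sort of splat centers.
// Visionary performs sorting on the GPU; these names are hypothetical.

type Vec3 = [number, number, number];

/** Depth of a point along the camera's forward axis (dot product). */
function viewDepth(center: Vec3, cameraPos: Vec3, forward: Vec3): number {
  return (
    (center[0] - cameraPos[0]) * forward[0] +
    (center[1] - cameraPos[1]) * forward[1] +
    (center[2] - cameraPos[2]) * forward[2]
  );
}

/** Returns splat indices ordered back to front (farthest first). */
function sortBackToFront(centers: Vec3[], cameraPos: Vec3, forward: Vec3): number[] {
  const depths = centers.map((c) => viewDepth(c, cameraPos, forward));
  const order = centers.map((_, i) => i);
  order.sort((a, b) => depths[b] - depths[a]);
  return order;
}

// Three splats at depths 2, 5 and 1 in front of a camera looking down +z.
const centers: Vec3[] = [[0, 0, 2], [0, 0, 5], [0, 0, 1]];
const order = sortBackToFront(centers, [0, 0, 0], [0, 0, 1]);
console.log(order); // [1, 0, 2]
```

A sort like this has to be redone whenever the camera moves, and scenes can contain a very large number of splats, which is why running it on the graphics card rather than in JavaScript matters for frame rate.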

Why does it matter?

Visionary makes it simpler for researchers and developers to work with and share these 3D models. Lowering that barrier encourages further development and experimentation with 'world models' (realistic, interactive 3D representations of the world) and makes AI-generated 3D content easier to integrate.

Abstract

Neural rendering, particularly 3D Gaussian Splatting (3DGS), has evolved rapidly and become a key component for building world models. However, existing viewer solutions remain fragmented, heavy, or constrained by legacy pipelines, resulting in high deployment friction and limited support for dynamic content and generative models. In this work, we present Visionary, an open, web-native platform for real-time rendering of various Gaussian Splatting variants and meshes. Built on an efficient WebGPU renderer with per-frame ONNX inference, Visionary enables dynamic neural processing while maintaining a lightweight, "click-to-run" browser experience. It introduces a standardized Gaussian Generator contract, which not only supports standard 3DGS rendering but also allows plug-and-play algorithms to generate or update Gaussians each frame. Such inference also enables us to apply feedforward generative post-processing. The platform further offers a plug-in three.js library with a concise TypeScript API for seamless integration into existing web applications. Experiments show that, under identical 3DGS assets, Visionary achieves superior rendering efficiency compared to current Web viewers due to GPU-based primitive sorting. It already supports multiple variants, including MLP-based 3DGS, 4DGS, neural avatars, and style transformation or enhancement networks. By unifying inference and rendering directly in the browser, Visionary significantly lowers the barrier to reproduction, comparison, and deployment of 3DGS-family methods, serving as a unified World Model Carrier for both reconstructive and generative paradigms.
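The idea of a "Gaussian Generator contract" can be pictured as a small interface that every scene source implements, whether it is a static 3DGS file or a network evaluated each frame. The sketch below is a guess at the general shape of such a contract, written for illustration; `GaussianFrame`, `GaussianGenerator`, `StaticGenerator`, and `BobbingGenerator` are hypothetical names, not Visionary's actual TypeScript API, and the per-frame ONNX inference described in the abstract is replaced here by a simple sine offset.

```typescript
// Hypothetical shapes for illustration; Visionary's real contract may differ.

/** Flat per-frame buffers describing `count` Gaussians. */
interface GaussianFrame {
  count: number;
  positions: Float32Array; // 3 floats per Gaussian (x, y, z)
  opacities: Float32Array; // 1 float per Gaussian
}

/** Anything that can produce Gaussians for a given time is a generator. */
interface GaussianGenerator {
  generate(timeSeconds: number): GaussianFrame;
}

/** Static scene: returns the same Gaussians every frame. */
class StaticGenerator implements GaussianGenerator {
  private readonly frame: GaussianFrame;
  constructor(frame: GaussianFrame) {
    this.frame = frame;
  }
  generate(_timeSeconds: number): GaussianFrame {
    return this.frame;
  }
}

/** Dynamic scene: stands in for a per-frame network by shifting every Gaussian along y. */
class BobbingGenerator implements GaussianGenerator {
  private readonly base: GaussianFrame;
  private readonly amplitude: number;
  constructor(base: GaussianFrame, amplitude: number) {
    this.base = base;
    this.amplitude = amplitude;
  }
  generate(timeSeconds: number): GaussianFrame {
    const positions = new Float32Array(this.base.positions); // copy, leave the base untouched
    const offset = this.amplitude * Math.sin(timeSeconds);
    for (let i = 0; i < this.base.count; i++) {
      positions[3 * i + 1] += offset;
    }
    return { ...this.base, positions };
  }
}

// A renderer only needs the interface, not the concrete algorithm behind it.
const base: GaussianFrame = {
  count: 2,
  positions: new Float32Array([0, 0, 0, 1, 1, 1]),
  opacities: new Float32Array([1, 0.5]),
};
const generators: GaussianGenerator[] = [
  new StaticGenerator(base),
  new BobbingGenerator(base, 0.5),
];
const frames = generators.map((g) => g.generate(Math.PI / 2));
```

The design point is that the renderer depends only on the interface, so a static asset, a 4DGS model, or an avatar network could in principle be swapped in without changing the rendering code.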