GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces

Melis Ocal, Xiaoyan Xing, Yue Li, Ngo Anh Vien, Sezer Karaoglu, Theo Gevers

2025-12-05

Summary

This paper introduces GaussianBlender, a technique for instantly restyling 3D models, represented as 3D Gaussians, from a text description.

What's the problem?

Currently, changing the style of a 3D model is slow and labor-intensive, often requiring artists to adjust each asset by hand. Existing methods that use text prompts to guide these edits have two problems: they are inconsistent, meaning the model can look different from different viewing angles, and they are too slow for producing large numbers of assets, as in video game development.

What's the solution?

GaussianBlender represents a 3D object as a collection of spatially grouped 'Gaussians' and learns two separate latent spaces: one for the object's shape (geometry) and one for its look (appearance). A type of generative AI called a latent diffusion model then applies text-conditioned style edits in these latent spaces, instantly at inference time. Because shape and appearance are kept separate, the edits stay consistent from every viewpoint and the original geometry of the model is preserved.
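The idea of disentangling geometry from appearance can be sketched in a few lines. The toy code below is a minimal illustration in plain Python/NumPy, not the authors' implementation: the channel layout, the `encode`/`edit_appearance`/`decode` functions, and the simple additive "edit" standing in for the latent diffusion model are all hypothetical. What it shows is the key property: if the editor only touches the appearance latent, the geometry passes through unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(gaussians):
    """Split Gaussian parameters into disentangled latents.
    Hypothetical layout: first 6 channels are geometry (position, scale),
    remaining channels are appearance (color, opacity)."""
    geometry = gaussians[:, :6].copy()
    appearance = gaussians[:, 6:].copy()
    return geometry, appearance

def edit_appearance(appearance, text_embedding):
    """Stand-in for the text-conditioned latent diffusion edit:
    here just a deterministic shift driven by a toy text embedding."""
    return appearance + 0.1 * text_embedding

def decode(geometry, appearance):
    """Reassemble Gaussians; geometry is untouched, so the object's
    shape is preserved while its style changes."""
    return np.concatenate([geometry, appearance], axis=1)

# Toy scene: 100 Gaussians with 6 geometry + 4 appearance channels.
gaussians = rng.normal(size=(100, 10))
text_embedding = rng.normal(size=(4,))  # hypothetical text-encoder output

geo, app = encode(gaussians)
styled = decode(geo, edit_appearance(app, text_embedding))

# Geometry channels are unchanged; appearance channels are edited.
assert np.allclose(styled[:, :6], gaussians[:, :6])
assert not np.allclose(styled[:, 6:], gaussians[:, 6:])
```

Because the edit is a single feed-forward pass through the latents rather than a per-asset optimization loop, stylization happens in one shot, which is the property the paper emphasizes.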

Why it matters?

This research is important because it makes 3D stylization much faster and more accessible. It allows for large-scale production of customized 3D assets without the need for time-consuming manual adjustments, which is a big step forward for game development, virtual reality, and other digital art fields.

Abstract

3D stylization is central to game development, virtual reality, and digital arts, where the demand for diverse assets calls for scalable methods that support fast, high-fidelity manipulation. Existing text-to-3D stylization methods typically distill from 2D image editors, requiring time-intensive per-asset optimization and exhibiting multi-view inconsistency due to the limitations of current text-to-image models, which makes them impractical for large-scale production. In this paper, we introduce GaussianBlender, a pioneering feed-forward framework for text-driven 3D stylization that performs edits instantly at inference. Our method learns structured, disentangled latent spaces with controlled information sharing for geometry and appearance from spatially-grouped 3D Gaussians. A latent diffusion model then applies text-conditioned edits on these learned representations. Comprehensive evaluations show that GaussianBlender not only delivers instant, high-fidelity, geometry-preserving, multi-view consistent stylization, but also surpasses methods that require per-instance test-time optimization - unlocking practical, democratized 3D stylization at scale.