Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling
Hanyang Kong, Xingyi Yang, Xinchao Wang
2025-02-28
Summary
This paper introduces Efficient Dynamic Gaussian Splatting (EDGS), a more efficient way to reconstruct 3D dynamic scenes from ordinary 2D videos. It's like making a 3D movie from a normal video, but faster and with better quality.
What's the problem?
Current methods for turning 2D videos into 3D scenes use too many 'Gaussians' (which are like 3D pixels) to represent everything in the video. This makes rendering slow and can cause jittering in parts of the scene that don't actually move. It's like using too many building blocks to make a model, which takes longer to build and sometimes wobbles.
What's the solution?
The researchers created EDGS, which uses fewer, smarter 'building blocks' to represent the 3D scene. They use a sparse anchor grid to describe how things move, with the motion of the dense Gaussians computed from the anchors via a classical kernel representation, plus an unsupervised way to identify which parts of the scene are moving and which are still. This means they only run the expensive math (MLPs) on the anchors that are actually moving, making the whole process much faster.
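To make the idea concrete, here is a minimal NumPy sketch of the two key ingredients described above: a static/dynamic split over sparse anchors, and kernel-weighted interpolation of anchor motion onto dense Gaussians. This is an illustrative toy, not the authors' implementation; the anchor counts, the RBF kernel, the bandwidth, and the motion threshold are all assumptions, and the per-anchor MLP query is replaced by precomputed motion vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a sparse set of anchor points and a dense set
# of Gaussian centers (real scenes use a structured anchor grid).
anchors = rng.uniform(-1.0, 1.0, size=(8, 3))      # sparse anchors
gaussians = rng.uniform(-1.0, 1.0, size=(100, 3))  # dense Gaussian centers

# Pretend an MLP has predicted per-anchor motion at some time step.
# Only the first 3 anchors move; the rest belong to static regions.
anchor_motion = np.zeros((8, 3))
anchor_motion[:3] = rng.normal(0.0, 0.1, size=(3, 3))

# Unsupervised static/dynamic split: anchors whose motion magnitude is
# below a small threshold are treated as static and skip the MLP query.
dynamic_mask = np.linalg.norm(anchor_motion, axis=1) > 1e-4

def kernel_weights(points, anchors, bandwidth=0.5):
    """Normalized RBF kernel weights from each point to every anchor."""
    d2 = ((points[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return w / w.sum(axis=1, keepdims=True)

W = kernel_weights(gaussians, anchors)

# Each Gaussian's motion flow is a kernel-weighted blend of the motion
# of dynamic anchors only; static anchors contribute zero flow.
flow = W[:, dynamic_mask] @ anchor_motion[dynamic_mask]
deformed = gaussians + flow
```

The point of the sketch is the cost structure: the MLP (here, the precomputed `anchor_motion`) is queried only for the few dynamic anchors, while the dense Gaussians get their motion almost for free through the kernel blend, and Gaussians far from any moving anchor stay essentially still.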
Why it matters?
This matters because it could make creating 3D videos from regular videos much faster and better looking. This could be really useful for things like virtual reality, where you want to turn real places into 3D environments quickly, or for special effects in movies. It could also help make better 3D models for things like video games or architectural designs, all while using less computer power.
Abstract
Rendering dynamic scenes from monocular videos is a crucial yet challenging task. The recent deformable Gaussian Splatting has emerged as a robust solution to represent real-world dynamic scenes. However, it often produces heavily redundant Gaussians that attempt to fit every training view at various time steps, leading to slower rendering speeds. Additionally, the attributes of Gaussians in static areas are time-invariant, so modeling every Gaussian is unnecessary and can cause jittering in static regions. In practice, the primary bottleneck in rendering speed for dynamic scenes is the number of Gaussians. In response, we introduce Efficient Dynamic Gaussian Splatting (EDGS), which represents dynamic scenes via sparse time-variant attribute modeling. Our approach formulates dynamic scenes using a sparse anchor-grid representation, with the motion flow of dense Gaussians calculated via a classical kernel representation. Furthermore, we propose an unsupervised strategy to efficiently filter out anchors corresponding to static areas. Only anchors associated with deformable objects are input into MLPs to query time-variant attributes. Experiments on two real-world datasets demonstrate that our EDGS significantly improves the rendering speed with superior rendering quality compared to previous state-of-the-art methods.