Key Features

Reconstructs complete 4D dynamic objects from monocular in-the-wild video.
Uses causal latent conditioning for temporally consistent per-frame 3D predictions.
Initializes a deformable 3D Gaussian Splatting representation from those per-frame predictions.
Uses canonical Gaussians animated by sparse deformation nodes.
Refines appearance with occlusion-aware losses and inpainted frames.
Uses a novel-view diffusion prior for unobserved and occluded regions.
Targets severe occlusions and non-rigid motion.
Provides an arXiv paper, a public code link, and direct video demos.
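
To make the "canonical Gaussians animated by sparse deformation nodes" idea concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the released implementation: the function names, the k-nearest-node Gaussian weighting, and the blending of per-node rigid transforms are hypothetical choices, since the listing does not describe how nodes drive the Gaussians.

```python
import numpy as np

def skinning_weights(points, nodes, k=3, sigma=0.5):
    """Assumed scheme: weight each canonical point by its k nearest nodes."""
    # Squared distances between every point and every node: shape (N, M).
    d2 = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :k]              # (N, k) nearest node ids
    near = np.take_along_axis(d2, idx, axis=1)       # (N, k) their distances
    # Subtract the row minimum so at least one weight is exactly 1 (no 0/0).
    near = near - near.min(axis=1, keepdims=True)
    w = np.exp(-near / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)
    return idx, w

def deform(points, nodes, rotations, translations, idx, w):
    """Move canonical Gaussian centers to one frame by blending node motions."""
    local = points[:, None, :] - nodes[idx]          # (N, k, 3) offsets to nodes
    rotated = np.einsum("nkij,nkj->nki", rotations[idx], local)
    moved = rotated + nodes[idx] + translations[idx] # per-node prediction
    return (w[..., None] * moved).sum(axis=1)        # weighted blend, (N, 3)

# Hypothetical toy data: 100 Gaussian centers driven by 8 nodes.
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3))
nodes = rng.normal(size=(8, 3))
idx, w = skinning_weights(points, nodes)
rotations = np.tile(np.eye(3), (8, 1, 1))            # identity rotation per node
translations = np.tile(np.array([0.1, 0.0, 0.0]), (8, 1))
frame_points = deform(points, nodes, rotations, translations, idx, w)
```

The appeal of this kind of design is that only the node transforms change per frame, so a few nodes can animate many Gaussians while the canonical set stays fixed.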

The method adapts a single-view 3D reconstruction model to generate temporally consistent per-frame 3D predictions through causal latent conditioning. These predictions initialize a deformable 3D Gaussian Splatting representation, which is then refined with occlusion-aware appearance optimization and a view-conditioned diffusion prior.
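
One simple way to picture occlusion-aware appearance optimization is a photometric loss that ignores pixels where the object is hidden. The sketch below assumes a masked L1 loss and a precomputed visibility mask; both the loss form and the name `occlusion_aware_l1` are assumptions for illustration, as the listing does not specify the actual losses.

```python
import numpy as np

def occlusion_aware_l1(rendered, target, visible_mask, eps=1e-8):
    """Mean absolute error over visible pixels only (assumed loss form).

    rendered, target: float arrays of shape (H, W, C).
    visible_mask: array of shape (H, W), nonzero where the object is unoccluded.
    """
    mask = visible_mask.astype(rendered.dtype)[..., None]   # (H, W, 1)
    diff = np.abs(rendered - target) * mask                 # zero out occluded pixels
    # Normalize by the number of visible values so the scale does not
    # depend on how much of the frame is occluded.
    return diff.sum() / (mask.sum() * rendered.shape[-1] + eps)
```

Under this assumption, a disagreement between the render and the frame inside an occluded region contributes nothing to the loss, which leaves those regions to be constrained by other terms such as inpainted frames or a diffusion prior.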


Lift4D is useful for 4D reconstruction research, dynamic object capture, and monocular video-to-asset workflows. It improves over prior baselines on challenging sequences with occlusion and non-rigid motion by combining observed details with learned completion priors.
