Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
Shengyuan Zhang, An Zhao, Ling Yang, Zejian Li, Chenye Meng, Haoran Xu, Tianrun Chen, AnYang Wei, Perry Pengyun GU, Lingyun Sun
2024-12-05

Summary
This paper introduces ScoreLiDAR, a new method for improving the efficiency of 3D LiDAR scene completion using diffusion models, which helps autonomous vehicles better understand their surroundings.
What's the problem?
3D LiDAR sensors capture point clouds of the environment, but a single scan leaves gaps. Scene completion fills in those gaps, and diffusion models do this with high quality, yet their sampling is slow. That is a problem for autonomous vehicles, which need to analyze their surroundings quickly to navigate safely. Existing diffusion-based methods perform well but are not fast enough for real-time applications.
What's the solution?
ScoreLiDAR addresses this issue with a distillation method tailored to 3D LiDAR scene completion: the distilled model samples in significantly fewer steps than the original diffusion model. To keep completion quality high, it also introduces a Structural Loss that encourages the model to capture the geometric structure of the scene. The loss has a scene-wise term that constrains the overall structure and a point-wise term that constrains key landmark points and their relative configuration. On SemanticKITTI, this combination cuts completion time from 30.55 to 5.37 seconds per frame (more than 5× faster) while outperforming state-of-the-art 3D LiDAR scene completion models.
Why does it matter?
This research is important because it enhances the capabilities of autonomous vehicles by allowing them to perceive their environment more efficiently. Faster and more accurate scene completion can lead to safer navigation and improved performance in various applications like self-driving cars, robotics, and other technologies that rely on understanding complex environments.
Abstract
Diffusion models have been applied to 3D LiDAR scene completion due to their strong training stability and high completion quality. However, the slow sampling speed limits the practical application of diffusion-based scene completion models since autonomous vehicles require an efficient perception of surrounding environments. This paper proposes a novel distillation method tailored for 3D LiDAR scene completion models, dubbed ScoreLiDAR, which achieves efficient yet high-quality scene completion. ScoreLiDAR enables the distilled model to sample in significantly fewer steps after distillation. To improve completion quality, we also introduce a novel Structural Loss, which encourages the distilled model to capture the geometric structure of the 3D LiDAR scene. The loss contains a scene-wise term constraining the holistic structure and a point-wise term constraining the key landmark points and their relative configuration. Extensive experiments demonstrate that ScoreLiDAR significantly accelerates the completion time from 30.55 to 5.37 seconds per frame (>5×) on SemanticKITTI and achieves superior performance compared to state-of-the-art 3D LiDAR scene completion models. Our code is publicly available at https://github.com/happyw1nd/ScoreLiDAR.
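To make the two-term shape of the Structural Loss concrete, here is a minimal pure-Python sketch. It is not taken from the ScoreLiDAR code. The abstract only says that the loss has a scene-wise term for the holistic structure and a point-wise term for key landmark points and their relative configuration, so everything else here is an assumption chosen for illustration: the Chamfer distance as the scene-wise term, landmark position error plus pairwise-distance mismatch as the point-wise term, the function names, the weights `w_scene` and `w_point`, and the idea that landmarks arrive already matched one-to-one.

```python
import math
from typing import List, Sequence

Point = Sequence[float]  # one (x, y, z) coordinate


def chamfer_distance(a: List[Point], b: List[Point]) -> float:
    """Assumed scene-wise term: symmetric Chamfer distance between two clouds."""
    def one_way(src: List[Point], dst: List[Point]) -> float:
        # Mean squared distance from each source point to its nearest target.
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)


def pairwise_distances(points: List[Point]) -> List[List[float]]:
    """Matrix of distances between every pair of landmark points."""
    return [[math.dist(p, q) for q in points] for p in points]


def landmark_term(pred: List[Point], gt: List[Point]) -> float:
    """Assumed point-wise term over landmarks matched by index.

    Adds the mean squared position error to the mean squared mismatch of
    pairwise distances, the latter standing in for "relative configuration".
    """
    n = len(gt)
    position = sum(math.dist(p, g) ** 2 for p, g in zip(pred, gt)) / n
    dp, dg = pairwise_distances(pred), pairwise_distances(gt)
    config = sum((dp[i][j] - dg[i][j]) ** 2
                 for i in range(n) for j in range(n)) / (n * n)
    return position + config


def structural_loss(pred_scene: List[Point], gt_scene: List[Point],
                    pred_landmarks: List[Point], gt_landmarks: List[Point],
                    w_scene: float = 1.0, w_point: float = 1.0) -> float:
    """Weighted sum of the two illustrative terms (weights are hypothetical)."""
    return (w_scene * chamfer_distance(pred_scene, gt_scene)
            + w_point * landmark_term(pred_landmarks, gt_landmarks))
```

In this sketch a rigid translation of the landmarks changes only the position part of the point-wise term, because their pairwise distances are unchanged. This brute-force version is quadratic in the number of points, so a real LiDAR scene would need a batched tensor implementation with nearest-neighbor acceleration.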