SparseCraft: Few-Shot Neural Reconstruction through Stereopsis Guided Geometric Linearization
Mae Younes, Amine Ouasfi, Adnane Boukhayma
2024-07-22

Summary
This paper introduces SparseCraft, a new method for reconstructing 3D shape and appearance from only a few colored images. The approach combines an implicit neural representation, a signed distance function paired with a radiance field, with multi-view stereo guidance to create detailed 3D models efficiently, without relying on pretrained priors.
What's the problem?
Reconstructing accurate 3D models from images is challenging, especially when only a few views are available. Many existing methods compensate by relying on priors pretrained on large datasets, which makes them data-hungry and slow to train or apply. This limits their usefulness in real-world settings where fast, effective reconstruction from limited data is needed.
What's the solution?
The authors developed SparseCraft, which uses a Signed Distance Function (SDF) and a radiance field to represent shape and color. The model is trained progressively through ray-marching-enabled volumetric rendering, and is regularized with learning-free multi-view stereo (MVS) cues that improve reconstruction accuracy without needing any pretrained models. A key ingredient is a learning strategy that encourages the SDF to be as linear as possible near its zero level-set, which makes training robust to noise in the supervision and regularization signals. This allows SparseCraft to produce high-quality 3D reconstructions and to synthesize new views from just a few input images, achieving state-of-the-art results in less than 10 minutes of training.
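To make the rendering step concrete, here is a minimal sketch of how per-sample SDF values and radiance can be composited along a ray with volumetric rendering. The summary does not give SparseCraft's exact formulation, so this uses a NeuS-style conversion from signed distances to compositing weights purely as an illustration; the function names, the `inv_s` sharpness parameter, and the tensor shapes are assumptions.

```python
import torch

def sdf_to_weights(sdf, inv_s):
    """Convert per-sample SDF values along each ray into compositing weights,
    via a NeuS-style logistic CDF (an assumption, not SparseCraft's exact rule).
    sdf: (rays, samples) signed distances at ordered points along each ray.
    inv_s: scalar sharpness of the logistic CDF (learned in practice).
    """
    cdf = torch.sigmoid(sdf * inv_s)
    # Opacity of each segment between consecutive samples.
    alpha = ((cdf[:, :-1] - cdf[:, 1:]) / (cdf[:, :-1] + 1e-6)).clamp(min=0.0)
    # Accumulated transmittance T_i = prod_{j<i} (1 - alpha_j).
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-7], dim=-1),
        dim=-1,
    )[:, :-1]
    return alpha * trans

def render_pixel(weights, colors):
    """Composite per-segment radiance into a pixel color: C = sum_i w_i c_i."""
    return (weights.unsqueeze(-1) * colors).sum(dim=1)

# Toy usage with random tensors standing in for network outputs.
sdf = torch.randn(1024, 64)      # 64 samples per ray for 1024 rays
rgb = torch.rand(1024, 63, 3)    # radiance for the 63 segments between samples
pixels = render_pixel(sdf_to_weights(sdf, inv_s=64.0), rgb)  # (1024, 3)
```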
Why it matters?
This research is significant because it makes it easier to create detailed 3D models from limited data, which can be applied in various fields like gaming, virtual reality, and robotics. By improving the efficiency and quality of 3D reconstruction, SparseCraft opens up new possibilities for using AI in creative and technical applications.
Abstract
We present a novel approach for recovering 3D shape and view dependent appearance from a few colored images, enabling efficient 3D reconstruction and novel view synthesis. Our method learns an implicit neural representation in the form of a Signed Distance Function (SDF) and a radiance field. The model is trained progressively through ray marching enabled volumetric rendering, and regularized with learning-free multi-view stereo (MVS) cues. Key to our contribution is a novel implicit neural shape function learning strategy that encourages our SDF field to be as linear as possible near the level-set, hence robustifying the training against noise emanating from the supervision and regularization signals. Without using any pretrained priors, our method, called SparseCraft, achieves state-of-the-art performances both in novel-view synthesis and reconstruction from sparse views in standard benchmarks, while requiring less than 10 minutes for training.
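The geometric linearization idea above can be illustrated with a short sketch: sample points near the zero level-set and penalize how far the SDF deviates from its own first-order Taylor expansion, alongside the usual eikonal term that keeps the gradient at unit norm. The abstract does not spell out the actual loss, so the perturbation scheme, the `eps` radius, and all names below are hypothetical.

```python
import torch

def linearization_losses(sdf_fn, x, eps=1e-2):
    """Encourage the SDF to behave linearly around near-surface points x.
    sdf_fn: differentiable callable mapping (N, 3) points to (N,) distances.
    x: (N, 3) points assumed to lie close to the zero level-set.
    eps: perturbation radius (hypothetical choice).
    """
    x = x.detach().requires_grad_(True)
    f = sdf_fn(x)
    grad = torch.autograd.grad(f.sum(), x, create_graph=True)[0]  # nabla f(x)
    delta = eps * torch.randn_like(x)                             # random offsets
    taylor = f + (grad * delta).sum(dim=-1)                       # f(x) + delta . nabla f(x)
    lin = (sdf_fn(x + delta) - taylor).abs().mean()               # linearity residual
    eik = ((grad.norm(dim=-1) - 1.0) ** 2).mean()                 # eikonal regularizer
    return lin, eik

# Toy usage with a small MLP standing in for the SDF network.
mlp = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.Softplus(), torch.nn.Linear(64, 1))
sdf_fn = lambda p: mlp(p).squeeze(-1)
pts = 0.05 * torch.randn(4096, 3)  # stand-in for points sampled near the surface
lin, eik = linearization_losses(sdf_fn, pts)
```

Intuitively, forcing the first-order expansion to stay accurate suppresses high-frequency wiggles in the SDF near the surface, which is what the abstract credits for robustness to noisy supervision and MVS regularization signals.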