Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views
Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross, Christopher Schroers
2026-01-09
Summary
This paper introduces a new system called HairGuard that improves how computers 'see' and understand soft boundaries, like wisps of hair or fur, in images and videos.
What's the problem?
Computers struggle with objects that don't have clearly defined edges, like hair or smoke. These 'soft boundaries' create confusion because it's hard to tell where the object ends and the background begins, leading to inaccurate 3D reconstructions and blurry images when creating new viewpoints. Existing 3D vision systems often get these areas wrong, making them look messy or unrealistic.
What's the solution?
HairGuard tackles this problem in a few key steps. First, it builds training data from image matting datasets (images with precise soft-edge masks) so the system learns what soft boundaries look like. Then, a dedicated 'depth fixer' network refines the depth information right around these fuzzy edges without disturbing the overall 3D shape. When creating new views of a scene, it warps existing textures based on depth and then uses a generative 'scene painter' to fill in newly revealed regions and remove leftover background details inside the soft boundaries. Finally, it adaptively blends the warped and painted results into a realistic, detailed new image.
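The final warp-then-paint-then-blend step can be sketched as a simple per-pixel fusion, where a confidence map decides whether each pixel comes from the warped texture or the inpainted image. This is a minimal illustrative sketch, not the paper's actual color fuser; the function and variable names are assumptions:

```python
import numpy as np

def fuse_views(warped, painted, confidence, hole_mask):
    """Adaptively blend a forward-warped view with an inpainted view.

    warped     : (H, W, 3) texture warped from the input view via depth
    painted    : (H, W, 3) output of a generative inpainting model
    confidence : (H, W, 1) in [0, 1]; trust in the warped texture
    hole_mask  : (H, W, 1) 1 where warping left a disocclusion hole
    """
    # Holes have no warped texture, so they must come from the painter.
    w = np.where(hole_mask > 0, 0.0, np.clip(confidence, 0.0, 1.0))
    return w * warped + (1.0 - w) * painted

# Toy 1x2 example: left pixel is a disocclusion hole, right pixel warped well.
warped = np.array([[[0.0, 0.0, 0.0], [0.8, 0.2, 0.1]]])
painted = np.array([[[0.5, 0.5, 0.5], [0.9, 0.3, 0.2]]])
conf = np.array([[[0.9], [0.9]]])
holes = np.array([[[1.0], [0.0]]])
out = fuse_views(warped, painted, conf, holes)
```

In the hole, the output is taken entirely from the painter; elsewhere it leans on the high-fidelity warped texture, which is what preserves fine hair detail.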
Why it matters?
This research is important because it significantly improves the quality of 3D vision applications, especially when dealing with complex scenes containing soft, delicate features. This has implications for things like creating more realistic special effects in movies, improving virtual reality experiences, and enabling robots to better understand the world around them.
Abstract
Soft boundaries, like thin hairs, are commonly observed in natural and computer-generated imagery, but they remain challenging for 3D vision due to the ambiguous mixing of foreground and background cues. This paper introduces Guardians of the Hair (HairGuard), a framework designed to recover fine-grained soft boundary details in 3D vision tasks. Specifically, we first propose a novel data curation pipeline that leverages image matting datasets for training and design a depth fixer network to automatically identify soft boundary regions. With a gated residual module, the depth fixer refines depth precisely around soft boundaries while maintaining global depth quality, allowing plug-and-play integration with state-of-the-art depth models. For view synthesis, we perform depth-based forward warping to retain high-fidelity textures, followed by a generative scene painter that fills disoccluded regions and eliminates redundant background artifacts within soft boundaries. Finally, a color fuser adaptively combines warped and inpainted results to produce novel views with consistent geometry and fine-grained details. Extensive experiments demonstrate that HairGuard achieves state-of-the-art performance across monocular depth estimation, stereo image/video conversion, and novel view synthesis, with significant improvements in soft boundary regions.
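The gated residual refinement mentioned in the abstract can be sketched as follows: the depth fixer predicts a correction plus a gate that confines that correction to soft-boundary regions, so the base model's depth elsewhere passes through unchanged. This is a minimal NumPy sketch under that assumption; the names are illustrative, not from the paper:

```python
import numpy as np

def gated_residual_refine(base_depth, residual, gate):
    """Apply a depth correction only where the gate is active.

    base_depth : (H, W) depth from a pretrained monocular depth model
    residual   : (H, W) correction predicted by the depth fixer
    gate       : (H, W) in [0, 1]; ~1 near soft boundaries, ~0 elsewhere
    """
    gate = np.clip(gate, 0.0, 1.0)
    return base_depth + gate * residual

# Toy 1x4 depth row: only the soft-boundary pixel (column 2) is corrected.
base = np.array([[1.0, 1.0, 1.5, 2.0]])
res = np.array([[0.3, 0.3, -0.4, 0.3]])
gate = np.array([[0.0, 0.0, 1.0, 0.0]])
refined = gated_residual_refine(base, res, gate)
```

Because the gate is near zero away from soft boundaries, the refiner cannot degrade global depth quality, which is what makes plug-and-play use with existing depth models plausible.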