Floating No More: Object-Ground Reconstruction from a Single Image
Yunze Man, Yichen Sheng, Jianming Zhang, Liang-Yan Gui, Yu-Xiong Wang
2024-07-29

Summary
This paper introduces ORG, a method for reconstructing a 3D object from a single image together with the ground surface it rests on. The goal is to keep reconstructed objects from appearing to float or tilt when they are placed on flat surfaces.
What's the problem?
Current techniques for creating 3D models from single images often fail to accurately depict how objects relate to the ground and camera. This can result in objects appearing as if they are floating or tilted, which is not realistic. This issue is particularly problematic for applications that require precise image editing, like adding shadows or adjusting object positions.
What's the solution?
The authors developed ORG (Object Reconstruction with Ground), which uses two compact pixel-level representations to describe the relationship between the object, the ground, and the camera. This lets ORG reconstruct the object's geometry together with the ground surface, so the resulting 3D model sits correctly on flat surfaces. In the experiments, ORG reconstructs object-ground geometry effectively on unseen data and yields better shadow generation and object pose manipulation than conventional single-image 3D reconstruction methods.
Why does it matter?
This research is significant because it enhances the realism of 3D object representations, which is important for various applications in computer graphics and image editing. By improving how we can visualize objects in relation to their environment, ORG can help create more convincing visual content in areas like gaming, virtual reality, and film production.
Abstract
Recent advancements in 3D object reconstruction from single images have primarily focused on improving the accuracy of object shapes. Yet, these techniques often fail to accurately capture the inter-relation between the object, ground, and camera. As a result, the reconstructed objects often appear floating or tilted when placed on flat surfaces. This limitation significantly affects 3D-aware image editing applications like shadow rendering and object pose manipulation. To address this issue, we introduce ORG (Object Reconstruction with Ground), a novel task aimed at reconstructing 3D object geometry in conjunction with the ground surface. Our method uses two compact pixel-level representations to depict the relationship between camera, object, and ground. Experiments show that the proposed ORG model can effectively reconstruct object-ground geometry on unseen data, significantly enhancing the quality of shadow generation and pose manipulation compared to conventional single-image 3D reconstruction techniques.
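The "floating or tilted" failure mode described above can be made concrete with a small geometric check. The sketch below is not the ORG method and does not use its pixel-level representations. It only assumes a reconstructed object given as a list of 3D points and a ground plane written as n·x + d = 0, then measures how far the object hovers above the plane and how much its up axis deviates from the ground normal. All function and variable names are hypothetical.

```python
import math

def signed_distance(point, normal, offset):
    """Signed distance from a point to the plane normal . x + offset = 0.
    The normal does not need to be unit length."""
    norm = math.sqrt(sum(c * c for c in normal))
    return (sum(p * n for p, n in zip(point, normal)) + offset) / norm

def floating_gap(points, normal, offset):
    """Smallest signed distance of any object point to the ground plane.
    Positive: the object floats. Negative: it penetrates the ground.
    Zero: it touches the ground."""
    return min(signed_distance(p, normal, offset) for p in points)

def tilt_degrees(up_axis, normal):
    """Angle in degrees between the object's up axis and the ground normal."""
    dot = sum(u * n for u, n in zip(up_axis, normal))
    len_up = math.sqrt(sum(u * u for u in up_axis))
    len_n = math.sqrt(sum(n * n for n in normal))
    cos_angle = max(-1.0, min(1.0, dot / (len_up * len_n)))  # guard rounding
    return math.degrees(math.acos(cos_angle))

# Illustrative data: a unit cube whose bottom face sits 0.2 above the plane y = 0.
cube = [(x, y + 0.2, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
ground_normal = (0.0, 1.0, 0.0)
ground_offset = 0.0

gap = floating_gap(cube, ground_normal, ground_offset)   # cube hovers above the ground
tilt = tilt_degrees((0.0, 1.0, 0.0), ground_normal)      # cube is upright
```

A reconstruction that is consistent with the ground would give a gap and a tilt close to zero for an object resting upright on the surface. Larger values correspond to the floating and tilted artifacts that affect shadow rendering and pose manipulation.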