SpaceBlender: Creating Context-Rich Collaborative Spaces Through Generative 3D Scene Blending
Nels Numan, Shwetha Rajaram, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, Andrew D. Wilson
2024-09-24

Summary
This paper introduces SpaceBlender, a system that uses generative AI to blend multiple users' real-world surroundings into a single shared virtual reality (VR) environment. It aims to make collaborative tasks in VR more effective by incorporating each user's physical context into the virtual space.
What's the problem?
Current VR systems typically place users in generic, artificial environments that ignore their actual surroundings. This disconnect between the user's physical context and the virtual space can make collaborative tasks harder, since familiar spatial cues and references to real-world objects are lost.
What's the solution?
To solve this problem, the researchers developed SpaceBlender, which takes 2D images of each user's physical environment and transforms them into a unified 3D virtual space. The pipeline proceeds in several steps: estimating depth (how far away things are) from each image, aligning the resulting 3D meshes with one another, and filling in gaps between and around the captured spaces using a diffusion model guided by geometric priors and adaptive text prompts. The researchers tested SpaceBlender with pairs of participants performing a collaborative task in VR and found that it provided a more familiar and context-aware experience than standard virtual environments.
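The steps above can be sketched as a simple pipeline. This is an illustrative skeleton only: every function name, data shape, and placeholder implementation here is an assumption for exposition, not the authors' actual code (a real system would use a learned depth estimator and a diffusion model for scene completion).

```python
import numpy as np

# Hypothetical sketch of a SpaceBlender-style pipeline: depth estimation,
# mesh alignment, and space completion. All names and logic are
# illustrative stand-ins, not the paper's implementation.

def estimate_depth(image: np.ndarray) -> np.ndarray:
    """Placeholder monocular depth estimator: one depth value per pixel.
    A real system would run a learned depth model here."""
    h, w, _ = image.shape
    return np.ones((h, w))

def lift_to_mesh(image: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Back-project each pixel into a 3D point using its estimated depth."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.stack([xs, ys, depth], axis=-1).reshape(-1, 3).astype(float)

def align_meshes(meshes: list[np.ndarray]) -> np.ndarray:
    """Naive alignment stand-in: offset each user's point cloud along x
    so the captured spaces sit side by side without overlapping."""
    aligned, offset = [], 0.0
    for pts in meshes:
        aligned.append(pts + np.array([offset, 0.0, 0.0]))
        offset += pts[:, 0].max() + 1.0
    return np.concatenate(aligned, axis=0)

def complete_space(points: np.ndarray) -> np.ndarray:
    """Stand-in for diffusion-based space completion; the real pipeline
    fills gaps with generated geometry guided by priors and text prompts."""
    return points

def blend_spaces(images: list[np.ndarray]) -> np.ndarray:
    """Blend several users' captured images into one combined point set."""
    meshes = [lift_to_mesh(img, estimate_depth(img)) for img in images]
    return complete_space(align_meshes(meshes))
```

The key design idea this sketch tries to convey is that blending is compositional: each user's 2D capture is lifted to 3D independently, and only then are the per-user geometries aligned and stitched into one shared space.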
Why it matters?
This research is important because it enhances how people interact in virtual reality, making it more natural and effective for teamwork. By creating environments that reflect users' real-world contexts, SpaceBlender can improve collaboration in various fields such as education, design, and remote work, leading to better outcomes in group projects.
Abstract
There is increased interest in using generative AI to create 3D spaces for Virtual Reality (VR) applications. However, today's models produce artificial environments, falling short of supporting collaborative tasks that benefit from incorporating the user's physical context. To generate environments that support VR telepresence, we introduce SpaceBlender, a novel pipeline that utilizes generative AI techniques to blend users' physical surroundings into unified virtual spaces. This pipeline transforms user-provided 2D images into context-rich 3D environments through an iterative process consisting of depth estimation, mesh alignment, and diffusion-based space completion guided by geometric priors and adaptive text prompts. In a preliminary within-subjects study, where 20 participants performed a collaborative VR affinity diagramming task in pairs, we compared SpaceBlender with a generic virtual environment and a state-of-the-art scene generation framework, evaluating its ability to create virtual spaces suitable for collaboration. Participants appreciated the enhanced familiarity and context provided by SpaceBlender but also noted complexities in the generative environments that could detract from task focus. Drawing on participant feedback, we propose directions for improving the pipeline and discuss the value and design of blended spaces for different scenarios.