fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction
Jianxiong Gao, Yuqian Fu, Yun Wang, Xuelin Qian, Jianfeng Feng, Yanwei Fu
2024-09-19

Summary
This paper presents the fMRI-3D dataset, designed to help researchers reconstruct 3D objects from brain activity recorded with functional Magnetic Resonance Imaging (fMRI).
What's the problem?
Reconstructing 3D visuals from fMRI data is important for understanding how our brains process visual information, but existing datasets are often limited in size and diversity. This makes it difficult for researchers to develop effective methods for translating brain signals into accurate 3D representations.
What's the solution?
The authors introduce the fMRI-3D dataset, which covers 15 participants and a total of 4,768 3D objects. It comprises two parts: fMRI-Shape, which has been previously introduced, and fMRI-Objaverse, new to this paper, which includes data from five subjects (four of whom also appear in fMRI-Shape's Core set), each viewing 3,142 objects across 117 categories, all paired with text captions. They also propose MinD-3D, a framework that decodes 3D visual information from fMRI signals in three stages: a neuro-fusion encoder extracts and aggregates features from the fMRI data, a feature-bridge diffusion model translates them into visual features, and a generative transformer decoder reconstructs the 3D object.
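To make the three-stage pipeline concrete, here is a minimal, schematic sketch in PyTorch. Everything in it is an illustrative assumption rather than the authors' implementation: the class names, layer counts, feature dimensions, and the single-loop "diffusion" stand-in are placeholders for the paper's full neuro-fusion encoder, feature-bridge diffusion model, and generative transformer decoder.

```python
# Schematic sketch of the MinD-3D three-stage data flow (illustrative only;
# all module names, sizes, and shapes are assumptions, not the paper's code).
import torch
import torch.nn as nn


class NeuroFusionEncoder(nn.Module):
    """Stage 1 (assumed stand-in): aggregate multi-frame fMRI signals
    into a compact feature sequence."""

    def __init__(self, n_voxels: int, d_model: int = 512):
        super().__init__()
        self.proj = nn.Linear(n_voxels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, fmri: torch.Tensor) -> torch.Tensor:
        # fmri: (batch, frames, voxels) -> (batch, frames, d_model)
        return self.encoder(self.proj(fmri))


class FeatureBridgeDiffusion(nn.Module):
    """Stage 2 (assumed stand-in): a tiny iterative-refinement loop standing
    in for the diffusion model that maps fMRI features to visual features."""

    def __init__(self, d_model: int = 512):
        super().__init__()
        self.denoise = nn.Sequential(
            nn.Linear(d_model * 2, d_model), nn.GELU(), nn.Linear(d_model, d_model)
        )

    def forward(self, fmri_feats: torch.Tensor, steps: int = 4) -> torch.Tensor:
        # Start from noise and iteratively refine, conditioned on fMRI features.
        x = torch.randn_like(fmri_feats)
        for _ in range(steps):
            x = x - self.denoise(torch.cat([x, fmri_feats], dim=-1))
        return x


class GenerativeTransformerDecoder(nn.Module):
    """Stage 3 (assumed stand-in): decode visual features into tokens of a
    3D representation (a flat placeholder for, e.g., shape tokens)."""

    def __init__(self, d_model: int = 512, n_tokens: int = 1024, vocab: int = 8192):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_tokens, d_model))
        layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, visual_feats: torch.Tensor) -> torch.Tensor:
        q = self.queries.unsqueeze(0).expand(visual_feats.size(0), -1, -1)
        return self.head(self.decoder(q, visual_feats))  # (batch, n_tokens, vocab)


if __name__ == "__main__":
    batch, frames, voxels = 2, 6, 4000  # illustrative fMRI dimensions
    fmri = torch.randn(batch, frames, voxels)
    feats = NeuroFusionEncoder(voxels)(fmri)         # stage 1: fMRI -> features
    visual = FeatureBridgeDiffusion()(feats)         # stage 2: features -> visual
    logits = GenerativeTransformerDecoder()(visual)  # stage 3: visual -> 3D tokens
    print(logits.shape)  # torch.Size([2, 1024, 8192])
```

The point of the sketch is the data flow, not the architecture details: raw multi-frame fMRI signals become a feature sequence, the diffusion stage bridges those features into a visual feature space, and the decoder emits tokens of a 3D representation from which an object can be reconstructed.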
Why it matters?
This research is significant because it provides a comprehensive dataset that can enhance our understanding of how the brain processes visual information and improve techniques for creating accurate 3D models from brain activity. This has potential applications in fields like cognitive neuroscience and computer vision, helping to develop better tools for studying the brain and creating assistive technologies.
Abstract
Reconstructing 3D visuals from functional Magnetic Resonance Imaging (fMRI) data, introduced as Recon3DMind in our conference work, is of significant interest to both cognitive neuroscience and computer vision. To advance this task, we present the fMRI-3D dataset, which includes data from 15 participants and showcases a total of 4,768 3D objects. The dataset comprises two components: fMRI-Shape, previously introduced and accessible at https://huggingface.co/datasets/Fudan-fMRI/fMRI-Shape, and fMRI-Objaverse, proposed in this paper and available at https://huggingface.co/datasets/Fudan-fMRI/fMRI-Objaverse. fMRI-Objaverse includes data from 5 subjects, 4 of whom are also part of the Core set in fMRI-Shape, with each subject viewing 3,142 3D objects across 117 categories, all accompanied by text captions. This significantly enhances the diversity and potential applications of the dataset. Additionally, we propose MinD-3D, a novel framework designed to decode 3D visual information from fMRI signals. The framework first extracts and aggregates features from fMRI data using a neuro-fusion encoder, then employs a feature-bridge diffusion model to generate visual features, and finally reconstructs the 3D object using a generative transformer decoder. We establish new benchmarks by designing metrics at both semantic and structural levels to evaluate model performance. Furthermore, we assess our model's effectiveness in an Out-of-Distribution setting and analyze the attribution of the extracted features and the visual ROIs in fMRI signals. Our experiments demonstrate that MinD-3D not only reconstructs 3D objects with high semantic and spatial accuracy but also deepens our understanding of how the human brain processes 3D visual information. Project page at: https://jianxgao.github.io/MinD-3D.
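Both components of the dataset are hosted on the Hugging Face Hub at the URLs above. As a minimal sketch, they can be pulled locally with the standard huggingface_hub client; the repository IDs come straight from the links, but the internal file layout of each repo is an assumption to verify against the dataset cards.

```python
# Minimal sketch: download both dataset components from the Hugging Face Hub.
# snapshot_download is the standard Hub API; what the downloaded directories
# contain (file formats, splits) should be checked on the dataset cards.
from huggingface_hub import snapshot_download

shape_dir = snapshot_download(repo_id="Fudan-fMRI/fMRI-Shape", repo_type="dataset")
objaverse_dir = snapshot_download(repo_id="Fudan-fMRI/fMRI-Objaverse", repo_type="dataset")

print("fMRI-Shape at:", shape_dir)
print("fMRI-Objaverse at:", objaverse_dir)
```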