SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images

Risa Shinoda, Kuniaki Saito, Shohei Tanaka, Tosho Hirasawa, Yoshitaka Ushiku

2024-12-30

Summary

This paper introduces SBS Figures, a dataset of chart images synthesized stage by stage and built to pre-train AI models that answer questions about figures and charts.

What's the problem?

Building a large dataset for figure question answering (QA) is labor-intensive: figures must be gathered and selected, attributes such as text, numbers, and colors must be extracted, and relevant questions must be written. Recent attempts to generate figures directly with LLMs often run into code errors, similar-looking figures, and repetitive content, producing low-quality figures that are of limited use for training.

What's the solution?

To address these challenges, the authors developed SBS Figures (Stage-by-Stage Synthetic Figures), which uses a stage-by-stage pipeline to create chart figures with complete annotations of the visualized data and dense QA annotations, without any manual annotation. The pipeline generates figures that vary in topic and appearance efficiently while minimizing code errors. The dataset includes 1 million synthetic figure images, each paired with accurate data annotations and question-answer pairs, so AI models can learn from high-quality examples. The authors also explored how different aspects of the dataset affect the training process.
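The idea of a stage-by-stage pipeline can be illustrated with a minimal Python sketch. This is not the authors' implementation: the three stages shown here (data generation, rendering with fixed code, QA derived from the data), the function names, and the SVG bar chart are all assumptions chosen for illustration. The point it tries to show is that when rendering code is written once and reused, no per-figure code generation can fail, and when QA pairs are computed from the underlying data, the answers match the figure by construction.

```python
import random


def generate_data(rng, topic, categories):
    """Stage 1 (assumed): produce the data to visualize as a plain dict."""
    return {
        "title": topic,
        "categories": list(categories),
        "values": [rng.randint(10, 100) for _ in categories],
    }


def render_bar_chart_svg(data, bar_color="steelblue"):
    """Stage 2 (assumed): render with fixed, reusable code instead of
    asking a model to write new plotting code for every figure."""
    width, height, bar_w, gap = 400, 200, 40, 20
    max_v = max(data["values"])
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<text x="10" y="15">{data["title"]}</text>',
    ]
    for i, (cat, v) in enumerate(zip(data["categories"], data["values"])):
        h = int(150 * v / max_v)          # scale bar height to the largest value
        x = 20 + i * (bar_w + gap)
        parts.append(
            f'<rect x="{x}" y="{180 - h}" width="{bar_w}" height="{h}" fill="{bar_color}"/>'
        )
        parts.append(f'<text x="{x}" y="195">{cat}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


def generate_qa(data):
    """Stage 3 (assumed): derive QA pairs from the data, so answers are exact."""
    pairs = list(zip(data["categories"], data["values"]))
    top_cat, _ = max(pairs, key=lambda p: p[1])
    qa = [
        {"question": f"What is the value for {c}?", "answer": str(v)}
        for c, v in pairs
    ]
    qa.append({"question": "Which category has the highest value?", "answer": top_cat})
    return qa


def make_example(seed):
    """Run all three stages and return the figure with its annotations."""
    rng = random.Random(seed)
    data = generate_data(rng, "Monthly sales", ["Jan", "Feb", "Mar"])
    return {"data": data, "svg": render_bar_chart_svg(data), "qa": generate_qa(data)}
```

Calling `make_example(0)` returns one record holding the data annotation, the rendered image as an SVG string, and the QA pairs. Varying the seed, topic, categories, and styling arguments is one plausible way to obtain diversity in such a setup.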

Why does it matter?

This research is important because it provides a scalable and efficient way to create high-quality training data for AI systems that need to understand and answer questions about visual information. By improving how AI learns from figures, SBS Figures can enhance applications in fields like education, data analysis, and scientific research, making it easier for AI to assist users in interpreting complex information.

Abstract

Building a large-scale figure QA dataset requires a considerable amount of work, from gathering and selecting figures to extracting attributes like text, numbers, and colors, and generating QAs. Although recent developments in LLMs have led to efforts to synthesize figures, most of these focus primarily on QA generation. Additionally, creating figures directly using LLMs often encounters issues such as code errors, similar-looking figures, and repetitive content in figures. To address this issue, we present SBSFigures (Stage-by-Stage Synthetic Figures), a dataset for pre-training figure QA. Our proposed pipeline enables the creation of chart figures with complete annotations of the visualized data and dense QA annotations without any manual annotation process. Our stage-by-stage pipeline makes it possible to create diverse topic and appearance figures efficiently while minimizing code errors. Our SBSFigures demonstrate a strong pre-training effect, making it possible to achieve efficient training with a limited amount of real-world chart data starting from our pre-trained weights.