CoDA: Agentic Systems for Collaborative Data Visualization

Zichen Chen, Jiefeng Chen, Sercan Ö. Arik, Misha Sra, Tomas Pfister, Jinsung Yoon

2025-10-06

Summary

This paper is about building a better system to automatically create data visualizations from simple text requests, like asking a computer to 'show me sales by region'.

What's the problem?

Currently, making visualizations from text is hard because real-world data is often complex, spread across multiple files, and requires a lot of back-and-forth tweaking to get right. Existing systems can handle simple requests, but they fall apart when dealing with complicated data or when you need to refine the visualization after the first attempt. They often focus on just understanding the initial request and don't handle errors or ensure the final visualization is actually good.

What's the solution?

The researchers created a system called CoDA that uses multiple specialized 'agent' programs, each powered by a large language model, working together. One agent analyzes the data's structure, another plans the steps needed to create the visualization, a third writes the plotting code, and a final agent checks the result and suggests improvements. By breaking the task down and letting each agent specialize, CoDA can handle more complex data and refine its visualizations over multiple rounds until they reach acceptable quality.
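To make the division of labor concrete, here is a minimal sketch of what such a pipeline could look like. Every function name, interface, and scoring rule below is a hypothetical stand-in for an LLM-powered agent, not CoDA's actual implementation; the real system prompts language models rather than running hard-coded logic.

```python
# Hypothetical sketch of a CoDA-style multi-agent pipeline.
# Each "agent" is a plain function standing in for an LLM call;
# all names and return formats here are illustrative assumptions.

def metadata_agent(files):
    # Summarize schemas instead of passing full file contents downstream,
    # which is how a metadata-focused analysis can sidestep token limits.
    return {name: {"columns": cols} for name, cols in files.items()}

def planner_agent(query, metadata):
    # Break the natural-language request into ordered steps.
    return ["load data", "aggregate sales by region", "render bar chart"]

def coder_agent(plan, metadata):
    # Emit plotting code for the plan (returned as a string in this sketch).
    return "df.groupby('region')['sales'].sum().plot(kind='bar')"

def reflection_agent(code, attempt):
    # Score the result and suggest a fix; accept after one refinement round.
    if attempt == 0:
        return 0.5, code + "  # add axis labels"
    return 1.0, code

def run_pipeline(query, files, max_rounds=3):
    meta = metadata_agent(files)
    plan = planner_agent(query, meta)
    code = coder_agent(plan, meta)
    for attempt in range(max_rounds):  # quality-driven refinement loop
        score, code = reflection_agent(code, attempt)
        if score >= 0.9:
            break
    return code

result = run_pipeline("show me sales by region",
                      {"sales.csv": ["region", "sales"]})
```

The key design point the paper argues for is visible even in this toy version: the reflection step closes a feedback loop over the generated code, so quality is checked and improved rather than assumed correct on the first attempt.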

Why it matters?

This research shows that the best way to automate visualization isn't just to have one program generate code, but to build a team of programs that collaborate. This approach makes visualization automation much more powerful and reliable, meaning data scientists can spend less time on tedious tasks and more time actually analyzing data.

Abstract

Deep research has revolutionized data analysis, yet data scientists still devote substantial time to manually crafting visualizations, highlighting the need for robust automation from natural language queries. However, current systems struggle with complex datasets containing multiple files and iterative refinement. Existing approaches, including simple single- or multi-agent systems, often oversimplify the task, focusing on initial query parsing while failing to robustly manage data complexity, code errors, or final visualization quality. In this paper, we reframe this challenge as a collaborative multi-agent problem. We introduce CoDA, a multi-agent system that employs specialized LLM agents for metadata analysis, task planning, code generation, and self-reflection. We formalize this pipeline, demonstrating how metadata-focused analysis bypasses token limits and quality-driven refinement ensures robustness. Extensive evaluations show CoDA achieves substantial gains in the overall score, outperforming competitive baselines by up to 41.5%. This work demonstrates that the future of visualization automation lies not in isolated code generation but in integrated, collaborative agentic workflows.