OralGPT-Omni: A Versatile Dental Multimodal Large Language Model

Jing Hao, Yuci Liang, Lizhuo Lin, Yuxuan Fan, Wenkai Zhou, Kaixin Guo, Zanting Ye, Yanpeng Sun, Xinyu Zhang, Yanqi Yang, Qiankun Li, Hao Tang, James Kit-Hon Tsoi, Linlin Shen, Kuo Feng Hung

2025-12-01

Summary

This paper introduces OralGPT-Omni, a new artificial intelligence model specifically designed to understand and analyze dental images and clinical information, making it the first of its kind.

What's the problem?

AI models haven't been fully developed for dentistry for several reasons: there is little specialized data available, dental experts haven't created enough labeled examples for training, existing models don't handle the different types of dental images well, and it's hard to ensure the models' diagnoses are reliable. In short, applying powerful AI to dentistry has been held back by a lack of resources and domain-specific expertise.

What's the solution?

The researchers created a new dataset called TRACE-CoT, which mimics how dentists actually reason when examining images to reach a diagnosis. They then used this dataset, along with a four-stage training process, to build OralGPT-Omni. They also built a benchmark called MMOral-Uni, a collection of 2,809 open-ended question-and-answer pairs spanning five imaging modalities and five clinical tasks, to test how well the model performs. OralGPT-Omni significantly outperformed other models on these tests, including very advanced ones like GPT-5.
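To make the benchmark idea concrete, here is a minimal sketch of how an overall score on a suite like MMOral-Uni might be aggregated from per-task scores. The task names and numbers below are invented for illustration; the paper does not specify its aggregation formula, so a simple mean is assumed here.

```python
# Hypothetical sketch: aggregate per-task benchmark scores (0-100 scale)
# into one overall number by taking their mean. Task names and values
# are made up for illustration, not taken from the paper.

def overall_score(task_scores: dict[str, float]) -> float:
    """Mean of per-task scores, rounded to two decimal places."""
    return round(sum(task_scores.values()) / len(task_scores), 2)

# Five invented tasks, echoing the benchmark's five-task structure.
example = {
    "report_generation": 48.0,
    "diagnosis_qa": 55.5,
    "anatomy_grounding": 50.0,
    "treatment_planning": 52.0,
    "modality_recognition": 53.5,
}

print(overall_score(example))  # prints 51.8
```

A weighted mean (e.g. by number of question-answer pairs per task) would be an equally plausible choice; the simple mean is used only to keep the sketch short.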

Why it matters?

This work is important because it brings the power of advanced AI to dentistry, potentially helping dentists diagnose problems more accurately and efficiently. It also provides the tools (the model and the benchmark) for other researchers to build upon and further improve AI in dental healthcare, ultimately leading to better patient care.

Abstract

Multimodal Large Language Models (MLLMs) have exhibited immense potential across numerous medical specialties; yet, dentistry remains underexplored, in part due to limited domain-specific data, scarce dental expert annotations, insufficient modality-specific modeling, and challenges in reliability. In this paper, we present OralGPT-Omni, the first dental-specialized MLLM designed for comprehensive and trustworthy analysis across diverse dental imaging modalities and clinical tasks. To explicitly capture dentists' diagnostic reasoning, we construct TRACE-CoT, a clinically grounded chain-of-thought dataset that mirrors dental radiologists' decision-making processes. This reasoning supervision, combined with our proposed four-stage training paradigm, substantially strengthens the model's capacity for dental image understanding and analysis. In parallel, we introduce MMOral-Uni, the first unified multimodal benchmark for dental image analysis. It comprises 2,809 open-ended question-answer pairs spanning five modalities and five tasks, offering the most comprehensive evaluation suite to date for MLLMs in digital dentistry. OralGPT-Omni achieves an overall score of 51.84 on the MMOral-Uni benchmark and 45.31 on the MMOral-OPG benchmark, dramatically outperforming GPT-5. Our work promotes intelligent dentistry and paves the way for future advances in dental image analysis. All code, benchmarks, and models will be made publicly available.