TAGS: A Test-Time Generalist-Specialist Framework with Retrieval-Augmented Reasoning and Verification

Jianghao Wu, Feilong Tang, Yulong Li, Ming Hu, Haochen Xue, Shoaib Jameel, Yutong Xie, Imran Razzak

2025-05-27


Summary

This paper introduces TAGS, a framework that helps medical language models give better answers by combining a general-purpose model with specialist models, supported by information retrieval and reliability checks.

What's the problem?

Medical language models often struggle to give accurate, trustworthy answers, especially for scenarios they were not specifically trained on. Relying on a single model can lead to mistakes or missed details, which is risky in healthcare.

What's the solution?

The authors created TAGS, a framework that works at test time, meaning at inference, without retraining or fine-tuning the models. It brings together a generalist model (which has broad knowledge) and specialist models (which are experts in certain areas), looks up relevant information through hierarchical retrieval, and scores how reliable each answer is.
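To make the idea concrete, the flow described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the authors' implementation: the names `retrieve`, `reliability`, and `answer_question` are hypothetical, the two "models" are hard-coded stubs, the retrieval is a flat word-overlap ranking rather than the hierarchical retrieval used in the paper, and the reliability score is a toy heuristic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Candidate:
    source: str         # which model produced the answer
    answer: str
    score: float = 0.0  # toy reliability score, filled in later


def retrieve(question: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy retrieval (assumption): rank passages by word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def reliability(answer: str, context: List[str]) -> float:
    """Toy score (assumption): fraction of retrieved passages that mention the answer."""
    if not context:
        return 0.0
    return sum(answer.lower() in p.lower() for p in context) / len(context)


def answer_question(
    question: str,
    models: Dict[str, Callable[[str, List[str]], str]],
    corpus: List[str],
) -> Candidate:
    """Query every model with the same retrieved context and keep the best-scored answer."""
    context = retrieve(question, corpus)
    candidates = [Candidate(name, fn(question, context)) for name, fn in models.items()]
    for c in candidates:
        c.score = reliability(c.answer, context)
    return max(candidates, key=lambda c: c.score)


# Stub models standing in for real LLM calls (hypothetical, hard-coded answers).
models = {
    "generalist": lambda q, ctx: "aspirin",
    "specialist": lambda q, ctx: "metformin",
}

# Toy corpus for illustration only.
corpus = [
    "Metformin is first-line therapy for type 2 diabetes",
    "Insulin is used for type 1 diabetes",
    "Aspirin reduces fever",
]

best = answer_question("What is first-line therapy for type 2 diabetes?", models, corpus)
print(best.source, best.answer, best.score)
```

Nothing here is trained or fine-tuned: the models are only queried, and the selection happens entirely at inference time, which is the property the summary emphasizes.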

Why does it matter?

This matters because it offers a way to make medical AI answers more accurate and dependable without retraining the models, which could make these tools safer for doctors and patients to use, even when new or unusual questions come up.

Abstract

TAGS, a test-time framework combining generalist and specialist models with hierarchical retrieval and reliability scoring, enhances medical LLM reasoning without fine-tuning.
