MedMobile: A mobile-sized language model with expert-level clinical capabilities

Krithik Vishwanath, Jaden Stryker, Anton Alaykin, Daniel Alexander Alber, Eric Karl Oermann

2024-10-18

Summary

This paper introduces MedMobile, a compact language model designed for medical applications that can run on mobile devices while still providing expert-level clinical capabilities.

What's the problem?

Large language models (LLMs) have shown great potential in assisting with medical tasks, but they are often too large and require too much computing power to be used on mobile devices. Additionally, there are concerns about privacy and the costs associated with using these models in everyday healthcare settings.

What's the solution?

To address these issues, the authors developed MedMobile, a smaller language model with 3.8 billion parameters that can run effectively on mobile devices. They demonstrated that MedMobile scores 75.7% on the MedQA (USMLE) exam, well above the passing mark for physicians (around 60%). The researchers also explored methods to improve the model's performance, such as chain-of-thought prompting and ensembling, which boosted accuracy. They found that while retrieval-augmented generation did not significantly improve results, the overall approach made MedMobile highly effective for medical tasks.
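The ensembling idea mentioned above can be illustrated as majority voting over answers drawn from several chain-of-thought samples. This is a minimal, hypothetical sketch, not the authors' actual pipeline; the `samples` list stands in for repeated model calls, which are not shown:

```python
from collections import Counter

def ensemble_answer(sampled_answers):
    """Majority-vote over final answers extracted from several
    chain-of-thought generations (self-consistency-style ensembling)."""
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(sampled_answers)
    # most_common(1) returns [(answer, count)] for the top answer
    return counts.most_common(1)[0][0]

# Hypothetical example: five chain-of-thought runs on one MedQA item,
# each ending in a final multiple-choice letter.
samples = ["B", "B", "C", "B", "A"]
print(ensemble_answer(samples))  # -> B
```

The intuition is that individual reasoning chains can make different mistakes, so aggregating their final answers tends to be more accurate than any single sample.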

Why it matters?

This research is important because it makes advanced medical AI technology accessible on mobile devices, allowing healthcare professionals to use it in real-time for diagnosis and treatment recommendations. By providing expert-level support without needing bulky hardware, MedMobile can help improve patient care and streamline medical workflows, making it a valuable tool in modern healthcare.

Abstract

Language models (LMs) have demonstrated expert-level reasoning and recall abilities in medicine. However, computational costs and privacy concerns are mounting barriers to wide-scale implementation. We introduce a parsimonious adaptation of phi-3-mini, MedMobile, a 3.8 billion parameter LM capable of running on a mobile device, for medical applications. We demonstrate that MedMobile scores 75.7% on the MedQA (USMLE), surpassing the passing mark for physicians (~60%), and approaching the scores of models 100 times its size. We subsequently perform a careful set of ablations, and demonstrate that chain of thought, ensembling, and fine-tuning lead to the greatest performance gains, while unexpectedly retrieval augmented generation fails to demonstrate significant improvements.