OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets
Maziyar Panahi
2025-08-07
Summary
This paper introduces OpenMed NER, a suite of open-source transformer models trained to recognize key biomedical entities, such as diseases, chemicals, and genes, in biomedical texts. The models are designed to generalize well across many types of biomedical data.
What's the problem?
Biomedical texts contain complex and highly varied terminology, and most existing models struggle to accurately detect and classify the many different biomedical entity types, especially while remaining computationally efficient.
What's the solution?
The solution is a three-stage training process that combines domain-adaptive pre-training (DAPT) on large biomedical corpora with lightweight, parameter-efficient adapters (LoRA), allowing the models to specialize in biomedical language without requiring large computational resources. Applied to strong transformer backbones, this approach yields state-of-the-art performance across many biomedical NER datasets.
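To make the LoRA idea concrete, here is a minimal, illustrative sketch (not the OpenMed NER code) of how a low-rank adapter augments a frozen weight matrix: only two small matrices `A` and `B` are trained, while the backbone weight `W` stays fixed. All names and dimensions below are assumptions chosen for illustration.

```python
import numpy as np

class LoRALinear:
    """Hypothetical LoRA-style linear layer: frozen W plus a trainable
    low-rank update delta_W = A @ B, scaled by alpha / rank."""

    def __init__(self, d_in, d_out, rank=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_in, d_out)) * 0.02  # frozen backbone weight
        self.A = rng.standard_normal((d_in, rank)) * 0.01   # trainable down-projection
        self.B = np.zeros((rank, d_out))                    # trainable up-projection (zero init)
        self.scale = alpha / rank

    def forward(self, x):
        # Backbone output plus the scaled low-rank correction.
        return x @ self.W + (x @ self.A @ self.B) * self.scale

    def trainable_params(self):
        # Only A and B are updated during fine-tuning.
        return self.A.size + self.B.size

layer = LoRALinear(d_in=768, d_out=768, rank=8)
print(layer.trainable_params(), layer.W.size)  # 12288 vs 589824 (~2% of the weights)
```

Because `B` is initialized to zero, the adapter initially leaves the backbone's output unchanged, and training only has to learn the small correction; this is what keeps the approach cheap relative to full fine-tuning.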
Why it matters?
This matters because accurate named-entity recognition in biomedical texts helps researchers and healthcare professionals extract valuable structured information from vast amounts of unstructured data, improving tasks like disease diagnosis, drug discovery, and patient care, all while keeping the technology accessible and efficient.
Abstract
OpenMed NER, a suite of open-source transformer models combining domain-adaptive pre-training (DAPT) with LoRA adapters, achieves state-of-the-art performance on diverse biomedical NER benchmarks with high efficiency and low computational cost.