DrugReasoner: Interpretable Drug Approval Prediction with a Reasoning-augmented Language Model

Mohammadreza Ghaffarzadeh-Esfahani, Ali Motahharynia, Nahid Yousefian, Navid Mazrouei, Jafar Ghaisari, Yousof Gheisari

2025-08-27

Summary

This paper introduces DrugReasoner, a new artificial intelligence model designed to predict whether a small-molecule drug candidate will be approved by regulators such as the FDA.

What's the problem?

Discovering new drugs is incredibly expensive and takes a long time. Many potential drugs ultimately fail during the approval process, wasting resources. Current AI methods can *predict* approval, but often act like a 'black box' – they give an answer without explaining *why*, making it hard for scientists to trust or learn from the predictions.

What's the solution?

The researchers built DrugReasoner on LLaMA, a large language model of the kind used for chatbots, and fine-tuned it on information about drugs, including molecular descriptors derived from their chemical structure. Importantly, DrugReasoner doesn't just give a 'yes' or 'no' answer; it explains its reasoning step by step by comparing the new drug to structurally similar compounds that were or were not approved, and it provides a confidence score. To improve its performance, they trained it with a reinforcement learning technique called group relative policy optimization (GRPO).
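To give a feel for what "group relative" means in group relative policy optimization, here is a minimal sketch of its core step: the model samples several answers to the same prompt, each answer gets a reward, and each reward is then scored relative to the rest of its group. The reward values below (1.0 for a correct approval label, 0.0 for an incorrect one), the function name, and the use of the population standard deviation are assumptions made for illustration; the summary does not describe the reward design the authors actually used.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to the other samples in its group.

    An answer that beats the group average gets a positive advantage,
    one that falls below it gets a negative advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    # eps avoids division by zero when every sample earns the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers about the same molecule:
# 1.0 = predicted the correct approval label, 0.0 = predicted the wrong one.
rewards = [1.0, 0.0, 1.0, 1.0]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f}  advantage={advantage:+.3f}")
```

During training, answers with positive advantages are made more likely and answers with negative advantages less likely, so the model learns from comparisons within each group rather than from a separately trained value model.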

Why does it matter?

DrugReasoner predicted drug approval better than traditional methods such as logistic regression, support vector machines, and k-nearest neighbors, and performed competitively with XGBoost, while also being more transparent. By providing explanations for its predictions, it helps scientists understand *why* a drug might succeed or fail, potentially saving time and money in the drug development process and leading to better decisions about which drugs to pursue.

Abstract

Drug discovery is a complex and resource-intensive process, making early prediction of approval outcomes critical for optimizing research investments. While classical machine learning and deep learning methods have shown promise in drug approval prediction, their limited interpretability constrains their impact. Here, we present DrugReasoner, a reasoning-based large language model (LLM) built on the LLaMA architecture and fine-tuned with group relative policy optimization (GRPO) to predict the likelihood of small-molecule approval. DrugReasoner integrates molecular descriptors with comparative reasoning against structurally similar approved and unapproved compounds, generating predictions alongside step-by-step rationales and confidence scores. DrugReasoner achieved robust performance with an AUC of 0.732 and an F1 score of 0.729 on the validation set and 0.725 and 0.718 on the test set, respectively. These results outperformed conventional baselines, including logistic regression, support vector machine, and k-nearest neighbors, and were competitive with XGBoost. On an external independent dataset, DrugReasoner outperformed both the baseline models and the recently developed ChemAP model, achieving an AUC of 0.728 and an F1 score of 0.774, while maintaining high precision and balanced sensitivity, demonstrating robustness in real-world scenarios. These findings demonstrate that DrugReasoner not only delivers competitive predictive accuracy but also enhances transparency through its reasoning outputs, thereby addressing a key bottleneck in AI-assisted drug discovery. This study highlights the potential of reasoning-augmented LLMs as interpretable and effective tools for pharmaceutical decision-making.
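
For readers unfamiliar with the two metrics reported in the abstract, the sketch below computes AUC and F1 from scratch on a tiny made-up example. The labels, confidence scores, and the 0.5 decision threshold are all invented for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (invented): 1 = approved, 0 = not approved.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]  # hypothetical model confidences
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

auc = roc_auc(y_true, scores)
f1 = f1_score(y_true, y_pred)
print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")
```

AUC depends only on how the confidence scores rank approved against unapproved compounds, while F1 depends on the yes/no decisions made at a chosen threshold, which is why the two are reported side by side.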
