Background: Drug-drug interactions (DDIs) are a global health concern affecting patient safety and treatment outcomes. Large language models (LLMs), such as ChatGPT, offer accessible alternatives for DDI screening; however, their effectiveness in DDI analysis remains unclear. This review evaluates the current evidence on the performance of LLM-based chatbots in identifying DDIs.
Methods: A PRISMA-compliant systematic review (PROSPERO: CRD420251020360) was conducted using PubMed, Scopus, and Web of Science, covering studies published between 1 January 2015 and 31 March 2025. Studies were eligible if they used publicly accessible LLM chatbots for DDI detection.
Results: Nine studies (2023-2025) evaluated publicly accessible LLM chatbots, including ChatGPT, Bing AI, and Google Bard, for DDI identification. Methods ranged from patient-level polypharmacy screening to single-drug checks and case vignettes. Chatbot performance was inconsistent: ChatGPT identified many potential DDIs, with ChatGPT-4.0 generally identifying more, but its accuracy varied, while Bing AI and Google Bard were less reliable.
Conclusion: Publicly accessible LLM chatbots demonstrate variable and partial effectiveness in detecting DDIs. There is a clear need to develop dedicated, freely available chatbots designed specifically for DDI identification. Future research should focus on standardizing evaluation methods and expanding access to improve medication safety in clinical practice.
Prospero: CRD420251020360.
