Autonomous Artificial Intelligence Agents for Clinical Decision Making in Oncology
Dyke Ferber, Omar S. M. El Nahhas, Georg Wölflein, Isabella C. Wiest, Jan Clusmann, Marie-Elisabeth Leßman, Sebastian Foersch, Jacqueline Lammert, Maximilian Tschochohei, Dirk Jäger, Manuel Salto-Tellez, Nikolaus Schultz, Daniel Truhn, Jakob Nikolas Kather
arXiv - QuanBio - Tissues and Organs, published 2024-04-06, arXiv:2404.04667
Abstract
Multimodal artificial intelligence (AI) systems have the potential to enhance
clinical decision-making by interpreting various types of medical data.
However, the effectiveness of these models across all medical fields is
uncertain. Each discipline presents unique challenges that need to be addressed
for optimal performance. This complexity increases further when different
fields are integrated into a single model. Here, we introduce an
alternative approach to multimodal medical AI that utilizes the generalist
capabilities of a large language model (LLM) as a central reasoning engine.
This engine autonomously coordinates and deploys a set of specialized medical
AI tools. These tools include text, radiology and histopathology image
interpretation, genomic data processing, web searches, and document retrieval
from medical guidelines. We validate our system across a series of clinical
oncology scenarios that closely resemble typical patient care workflows. We
show that the system is highly capable of employing appropriate tools
(97%), drawing correct conclusions (93.6%), and providing complete (94%) and
helpful (89.2%) recommendations for individual patient cases while consistently
referencing relevant literature (82.5%) upon instruction. This work provides
evidence that LLMs can effectively plan and execute domain-specific models to
retrieve or synthesize new information when used as autonomous agents. This
enables them to function as specialist, patient-tailored clinical assistants.
It also simplifies regulatory compliance by allowing each component tool to be
individually validated and approved. We believe that our work can serve as a
proof of concept for more advanced LLM agents in the medical domain.
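
The coordination pattern described in the abstract, a central engine that selects and runs specialized tools, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the names `ToolRegistry`, `ToolCall`, `stub_planner`, and `run_agent` are hypothetical, the tools are placeholder functions that return strings, and the planner is a simple rule standing in for the LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    """One planned invocation: which tool to run and on which input."""
    tool: str
    argument: str


@dataclass
class ToolRegistry:
    """Maps tool names to callables so each tool can be tested in isolation."""
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        self.tools[name] = fn

    def run(self, call: ToolCall) -> str:
        if call.tool not in self.tools:
            raise KeyError(f"unknown tool: {call.tool}")
        return self.tools[call.tool](call.argument)


def stub_planner(case: Dict[str, str], available: List[str]) -> List[ToolCall]:
    """Hypothetical stand-in for the LLM: plan one call per modality in the case."""
    return [ToolCall(tool=name, argument=case[name])
            for name in available if name in case]


def run_agent(case: Dict[str, str], registry: ToolRegistry,
              planner: Callable[[Dict[str, str], List[str]], List[ToolCall]] = stub_planner
              ) -> Dict[str, str]:
    """Ask the planner which tools to use, run them, and collect the outputs."""
    plan = planner(case, list(registry.tools))
    return {call.tool: registry.run(call) for call in plan}


# Placeholder tools (assumed names); real ones would wrap validated models.
registry = ToolRegistry()
registry.register("radiology", lambda x: f"radiology report for {x}")
registry.register("histopathology", lambda x: f"histopathology finding for {x}")
registry.register("genomics", lambda x: f"variant summary for {x}")
registry.register("guideline_search", lambda x: f"guideline passages for {x}")

# A made-up case that carries only imaging and genomic inputs.
case = {"radiology": "ct_scan_01", "genomics": "panel_01"}
results = run_agent(case, registry)
```

Because every tool sits behind the same small interface, each one can be exercised and checked on its own, which mirrors the abstract's point that component tools can be validated individually.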