Ceca Kraišniković, Robert Harb, Markus Plass, Wael Al Zoughbi, Andreas Holzinger, Heimo Müller
Engineering Applications of Artificial Intelligence, vol. 139, Article 109561. Published 2024-11-15. DOI: 10.1016/j.engappai.2024.109561. Impact factor: 7.5; JCR Q1 (Automation & Control Systems). https://www.sciencedirect.com/science/article/pii/S0952197624017196
Fine-tuning language model embeddings to reveal domain knowledge: An explainable artificial intelligence perspective on medical decision making
Integrating large language models (LLMs) to retrieve targeted medical knowledge from electronic health records enables significant advancements in medical research. However, recognizing the challenges associated with using LLMs in healthcare is essential for successful implementation. One challenge is that medical records combine unstructured textual information with highly sensitive personal data. This, in turn, highlights the need for explainable Artificial Intelligence (XAI) methods to better understand how LLMs function in the medical domain. In this study, we propose a novel XAI tool to accelerate data-driven cancer research. We apply the Bidirectional Encoder Representations from Transformers (BERT) model to German-language pathology reports, examining the effects of domain-specific language adaptation and fine-tuning. We demonstrate our model on a real-world pathology dataset, analyzing the contextual representations of diagnostic reports. By illustrating decisions made by fine-tuned models, we provide decision values that can be applied in medical research. To address interpretability, we conduct a performance evaluation of the classifications generated by our fine-tuned model, as assessed by an expert pathologist. In domains such as medicine, inspection of the medical knowledge map in conjunction with expert evaluation reveals valuable information about how contextual representations of key disease features are categorized. This ultimately benefits data structuring and labeling and paves the way for even more advanced approaches to XAI that combine text with other input modalities, such as images, and that are then applicable to various engineering problems.
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, with remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.