Benchmarking the most popular XAI used for explaining clinical predictive models: Untrustworthy but could be useful.

Aida Brankovic, David Cook, Jessica Rahman, Sankalp Khanna, Wenjie Huang

Health Informatics Journal, 30(4): 14604582241304730 (2024). DOI: 10.1177/14604582241304730
Objective: This study aimed to assess the practicality and trustworthiness of explainable artificial intelligence (XAI) methods used for explaining clinical predictive models.
Methods: Two popular XAI methods used for explaining clinical predictive models were evaluated on three criteria: their ability to generate domain-appropriate representations, their impact on clinical workflow, and their consistency. Explanations were benchmarked against the true clinical deterioration triggers recorded in the data system, and agreement was quantified. The evaluation was conducted using two Electronic Medical Records datasets from major hospitals in Australia. Results were examined and commented on by a senior clinician.
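The abstract does not specify how agreement was quantified. As a minimal sketch of one way such a benchmark could work, assume each patient's explanation is reduced to a set of top-ranked features and the recorded deterioration triggers form a second set; concordance can then be measured as set overlap. All names below (feature labels, patient IDs, the Jaccard measure) are illustrative assumptions, not the authors' protocol.

```python
# Minimal sketch (not the paper's code): quantifying agreement between
# XAI-derived top features and clinical deterioration triggers recorded
# in the data system. Data and the Jaccard measure are hypothetical.

def jaccard(a: set, b: set) -> float:
    """Set overlap in [0, 1]; 1.0 means perfect agreement."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

# Hypothetical per-patient top-ranked explanation features.
explanations = {
    "patient_1": {"heart_rate", "resp_rate", "spo2"},
    "patient_2": {"systolic_bp", "heart_rate"},
}
# Hypothetical triggers recorded in the EMR for the same patients.
triggers = {
    "patient_1": {"resp_rate", "spo2"},
    "patient_2": {"systolic_bp", "gcs"},
}

scores = {pid: jaccard(explanations[pid], triggers[pid]) for pid in explanations}
print(scores)  # per-patient concordance, e.g. {'patient_1': 0.67, 'patient_2': 0.33}
```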
Results: The findings demonstrate a violation of the consistency criterion and only moderate concordance (0.47-0.8) with the true triggers, undermining reliability and actionability, two prerequisites for clinicians' trust in XAI.
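The abstract reports that the consistency criterion was violated but does not describe the test. One plausible minimal check, sketched below under the assumption that the explainer is run repeatedly on the same input and the resulting top-k feature sets are compared pairwise, is a top-k stability score. Function names and attribution values are hypothetical.

```python
# Illustrative consistency check (an assumption, not the paper's protocol):
# run an explainer several times on one input and compare top-k rankings.
import numpy as np

def rank_features(attributions: np.ndarray) -> np.ndarray:
    """Indices of features ordered by descending absolute attribution."""
    return np.argsort(-np.abs(attributions))

def top_k_stability(runs: list, k: int = 3) -> float:
    """Fraction of run pairs whose top-k feature sets are identical."""
    tops = [set(rank_features(r)[:k]) for r in runs]
    pairs = [(i, j) for i in range(len(tops)) for j in range(i + 1, len(tops))]
    return sum(tops[i] == tops[j] for i, j in pairs) / len(pairs)

# Hypothetical attribution vectors from repeated explainer runs on one input;
# the third run ranks features differently, signalling inconsistency.
runs = [np.array([0.50, 0.10, -0.30, 0.05]),
        np.array([0.48, 0.12, -0.28, 0.06]),
        np.array([0.10, 0.50, -0.30, 0.05])]
print(top_k_stability(runs, k=2))  # 0.33: only one of three pairs agrees
```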
Conclusion: The explanations are not trustworthy enough to guide clinical interventions, though they may offer useful insights and help with model troubleshooting. Clinician-informed XAI development and presentation, clear disclaimers about limitations, and critical clinical judgment can promote informed decisions and prevent over-reliance.
Journal introduction:
Health Informatics Journal is an international peer-reviewed journal. All papers submitted to Health Informatics Journal are subject to peer review by members of a carefully appointed editorial board. The journal operates a conventional single-blind reviewing policy in which the reviewer’s name is always concealed from the submitting author.