Concept-based AI interpretability in physiological time-series data: Example of abnormality detection in electroencephalography

IF 4.9 | Zone 2 (Medicine) | Q1, Computer Science, Interdisciplinary Applications | Computer Methods and Programs in Biomedicine | Pub Date: 2024-09-30 | DOI: 10.1016/j.cmpb.2024.108448

Alexander Brenner, Felix Knispel, Florian P. Fischer, Peter Rossmanith, Yvonne Weber, Henner Koch, Rainer Röhrig, Julian Varghese, Ekaterina Kutafina

Volume 257, Article 108448. Source: https://www.sciencedirect.com/science/article/pii/S0169260724004413

Citations: 0

Abstract

Background and Objective

Despite recent performance advancements, deep learning models are not yet widely adopted in clinical practice. The intrinsic lack of transparency of such systems is commonly cited as one major reason for this reluctance. This has motivated methods that aim to explain how models function. Known limitations of feature-based explanations have led to increased interest in concept-based interpretability. Testing with Concept Activation Vectors (TCAV) employs human-understandable, abstract concepts to explain model behavior. The method has previously been applied in the medical domain to electronic health records, retinal fundus images and magnetic resonance imaging.
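To make the TCAV idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that layer activations and class-logit gradients have already been extracted from a trained model, and all function names are hypothetical. The original TCAV method trains a linear classifier to separate concept activations from random activations and uses the normal of its decision boundary as the concept activation vector (CAV); here a simpler difference of class means stands in for that step.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def concept_activation_vector(concept_acts, random_acts):
    """Unit vector pointing from random activations toward concept activations.

    Assumption: a difference of class means replaces the trained linear
    classifier used in the original TCAV method.
    """
    mu_concept = mean_vector(concept_acts)
    mu_random = mean_vector(random_acts)
    diff = [c - r for c, r in zip(mu_concept, mu_random)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("concept and random activations have identical means")
    return [d / norm for d in diff]


def tcav_score(gradients, cav):
    """Fraction of inputs whose class logit increases along the CAV."""
    positive = sum(
        1 for g in gradients
        if sum(gi * vi for gi, vi in zip(g, cav)) > 0
    )
    return positive / len(gradients)


# Toy 2-D activations and gradients, invented purely for illustration.
concept_acts = [[2.0, 0.0], [4.0, 0.0]]
random_acts = [[0.0, 0.0], [0.0, 0.0]]
gradients = [[1.0, 0.0], [0.5, -1.0], [-1.0, 2.0], [2.0, 2.0]]

cav = concept_activation_vector(concept_acts, random_acts)
score = tcav_score(gradients, cav)
print(cav, score)
```

In practice, a score obtained this way is usually compared against scores from CAVs built on random example sets, to check that the concept's influence is not due to chance.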

Methods

We explore the use of TCAV for building interpretable models on physiological time series, using abnormality detection in electroencephalography (EEG) as an example. For this purpose, we adopt the XceptionTime model, which is suitable for multi-channel physiological data of variable sizes. The model provides state-of-the-art performance on raw EEG data and is publicly available. We propose and test several ideas for concept definition: metadata mining, the use of additional labeled EEG data, and the extraction of interpretable signal characteristics in the form of frequencies. By including our own hospital data with analogous labeling, we further evaluate the robustness of our approach.
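One way a frequency-based concept could be defined is sketched below. This is an illustration under stated assumptions, not the procedure from the paper: the alpha band limits (8 to 13 Hz), the 0.5 threshold, and the function names are all hypothetical choices. Segments whose spectral power falls mostly inside the band become concept examples; the rest can serve as counterexamples.

```python
import cmath
import math


def band_power_fraction(signal, fs, f_lo, f_hi):
    """Share of non-DC spectral power of `signal` in [f_lo, f_hi) Hz.

    Uses a naive discrete Fourier transform, which is fine for short
    illustrative segments but slow for long recordings.
    """
    n = len(signal)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency bins, DC skipped
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        power = abs(coeff) ** 2
        freq = k * fs / n  # bin centre frequency in Hz
        total += power
        if f_lo <= freq < f_hi:
            in_band += power
    return in_band / total if total > 0 else 0.0


def split_by_concept(segments, fs, band=(8.0, 13.0), threshold=0.5):
    """Split segments into concept examples and counterexamples.

    Assumption: a segment shows the concept when at least `threshold`
    of its power lies inside `band`.
    """
    concept, other = [], []
    for seg in segments:
        if band_power_fraction(seg, fs, band[0], band[1]) >= threshold:
            concept.append(seg)
        else:
            other.append(seg)
    return concept, other


# Synthetic one-second, single-channel segments sampled at 64 Hz.
fs = 64
alpha_like = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
slow_wave = [math.sin(2 * math.pi * 2 * t / fs) for t in range(fs)]

concept_examples, counterexamples = split_by_concept([alpha_like, slow_wave], fs)
print(len(concept_examples), len(counterexamples))
```

The resulting concept examples would then be passed through the model to collect the layer activations from which a concept activation vector is derived.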

Results

The tested concepts show a TCAV score distribution that is in line with clinical expectations: concepts known to have strong links with EEG pathologies (such as epileptiform discharges) received higher scores than neutral concepts (e.g. sex). The scores were consistent across the applied concept generation strategies.

Conclusions

TCAV has the potential to improve the interpretability of deep learning applied to multi-channel signals and to detect possible biases in the data. Still, further work on strategies for concept definition and validation on clinical physiological time series is needed to better understand how to extract clinically relevant information from the concept sensitivity scores.


Source journal

Computer Methods and Programs in Biomedicine (Engineering & Technology: Biomedical Engineering)

CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days

Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.