Comparison of the methods of operation of the artificial intelligence system in the ultra-high sensitivity mode for the autonomous description of chest X-rays without pathology

Digital Diagnostics | Pub Date: 2024-07-03 | DOI: 10.17816/dd626001

E. D. Nikitin, Nikita S. Plaksin, Maria B. Garetz, Evgeniy M. Gutin


Abstract

BACKGROUND: Up to 95% of digital fluorography screening studies show no pathological changes, and radiologists typically spend most of their time reviewing and describing such studies. Artificial intelligence systems can automate these descriptions and thereby save physicians’ time [1–3].

AIM: To compare the efficacy of several algorithms within an existing artificial intelligence system in an ultra-high sensitivity scenario and to estimate the percentage of X-rays that could be described automatically.

MATERIALS AND METHODS: The artificial intelligence system “Cels.Fluorography” version 0.15.3 was used for the analysis. The dataset, collected from several medical organizations, comprised 11,707 studies without pathology and 5,846 studies with pathology. To calculate the metrics, a subsample of 500 studies with pathology and 9,500 studies without pathology (a 5% to 95% balance) was drawn at random from the dataset 1,000 times, and the resulting metrics were averaged.

The target variable was derived from the markup of two physicians; where they disagreed, the study was referred to an expert physician for evaluation. An X-ray was considered pathological if the final markup contained at least one of 12 radiological features [4].

Five methods of computing a study-level score were compared: the maximum (1) and mean (2) probability of the radiological features localized by the neural network detector; the maximum (3) and mean (4) probability of feature presence from dedicated “heads” of the neural network, each trained to determine whether a given feature is present in the image (0 for no feature, 1 for presence); and the probability (5) from a separate “head” trained to determine whether the study contains any pathology (0 for normal, 1 for pathology).
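The maximum and mean aggregation used by methods 1 to 4 can be illustrated with a minimal sketch. The function name `study_score`, the example probabilities, and the handling of an empty list are assumptions made for illustration; the abstract does not describe how the system implements this step.

```python
from statistics import mean


def study_score(feature_probs, how):
    """Reduce the per-feature probabilities of one study to a single score."""
    if not feature_probs:
        # Assumption: a detector that localizes nothing yields a score of 0.
        return 0.0
    if how == "max":
        return max(feature_probs)
    if how == "mean":
        return mean(feature_probs)
    raise ValueError(f"unknown aggregation: {how!r}")


# Hypothetical probabilities for three features in one study.
probs = [0.02, 0.10, 0.60]
score_max = study_score(probs, "max")    # driven by the single strongest finding
score_mean = study_score(probs, "mean")  # diluted by the low-probability features
```

The two reductions can rank studies differently: the maximum reacts to one confident finding, while the mean also reflects how many features are elevated.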
For each method, a response threshold was selected so that no more than one pathology was missed per 1,000 examinations in the current subsample. The main quality metric was the percentage of X-rays that the artificial intelligence system could correctly identify as pathology-free.

RESULTS: For methods 1 to 5, the average percentage of studies screened out as normal (“norm dropout”) was 66.4%, 72.2%, 69.0%, 74.1%, and 68.7%, and the area under the ROC curve was 0.948, 0.957, 0.964, 0.967, and 0.971, respectively. The 95% confidence interval for the screening-out rate of the optimal method was 66.1% to 79.4%.

CONCLUSIONS: Modern artificial intelligence systems can automate the description of a significant portion of screening studies. The most effective method for screening out normal studies (over 74% of the flow) was averaging the probabilities from the dedicated “heads” of the neural network trained to identify the presence of individual radiological features (method 4).
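The threshold selection and repeated subsampling described in the methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the share is taken over all studies in the subsample, subsamples are drawn without replacement, and the interval is a simple percentile interval over the rounds (the abstract does not say how its confidence interval was computed).

```python
import random
from statistics import mean


def screened_out_share(scores, labels, misses_per_1000=1):
    """Share of all studies correctly called normal within the miss budget.

    scores: study-level pathology scores; labels: 1 for pathology, 0 for normal.
    """
    allowed_misses = len(scores) * misses_per_1000 // 1000
    path_scores = sorted(s for s, y in zip(scores, labels) if y == 1)
    # A study is called normal when its score is strictly below the threshold.
    # Placing the threshold at the (allowed_misses + 1)-th lowest pathology
    # score leaves at most allowed_misses pathological studies below it.
    if allowed_misses < len(path_scores):
        threshold = path_scores[allowed_misses]
    else:
        threshold = float("inf")
    true_normals = sum(1 for s, y in zip(scores, labels)
                       if y == 0 and s < threshold)
    return true_normals / len(scores)


def repeated_subsample_eval(scores, labels, n_path=500, n_norm=9500,
                            n_rounds=1000, seed=0):
    """Average the metric over random subsamples with a fixed class balance."""
    rng = random.Random(seed)
    path_idx = [i for i, y in enumerate(labels) if y == 1]
    norm_idx = [i for i, y in enumerate(labels) if y == 0]
    shares = []
    for _ in range(n_rounds):
        # Assumption: each subsample is drawn without replacement.
        idx = rng.sample(path_idx, n_path) + rng.sample(norm_idx, n_norm)
        shares.append(screened_out_share([scores[i] for i in idx],
                                         [labels[i] for i in idx]))
    shares.sort()
    # Assumption: an approximate percentile interval over the rounds.
    lo = shares[int(0.025 * (n_rounds - 1))]
    hi = shares[int(0.975 * (n_rounds - 1))]
    return mean(shares), (lo, hi)
```

With the default sizes, each subsample holds 10,000 studies, so the budget of one miss per 1,000 examinations allows at most 10 missed pathologies per round.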


