Computerized Medical Imaging and Graphics: Latest Articles, Page 6

A Parkinson’s disease-related nuclei segmentation network based on CNN-Transformer interleaved encoder with feature fusion

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-11-19 DOI: 10.1016/j.compmedimag.2024.102465

Hongyi Chen , Junyan Fu , Xiao Liu , Zhiji Zheng , Xiao Luo , Kun Zhou , Zhijian Xu , Daoying Geng

Automatic segmentation of Parkinson’s disease (PD) related deep gray matter (DGM) nuclei based on brain magnetic resonance imaging (MRI) is significant in assisting the diagnosis of PD. However, due to degeneration-induced changes in appearance, low tissue contrast, and the small size of DGM nuclei in brain MRI images of older adults, many existing segmentation models are limited in their application. To address these challenges, this paper proposes a PD-related DGM nuclei segmentation network to provide precise prior knowledge for aiding the diagnosis of PD. The encoder of the network is designed as an alternating encoding structure where the convolutional neural network (CNN) captures spatial and depth texture features, while the Transformer complements global position information between DGM nuclei. Moreover, we propose a cascaded channel-spatial-wise block to fuse features extracted by the CNN and Transformer, thereby achieving more precise DGM nuclei segmentation. The decoder incorporates a symmetrical boundary attention module, leveraging the symmetrical structures of bilateral nuclei regions by constructing signed distance maps for symmetric differences, which optimizes segmentation boundaries. Furthermore, we employ a dynamic adaptive region-of-interest weighted Dice loss to enhance sensitivity towards smaller structures, thereby improving segmentation accuracy. In quantitative analysis, our method achieved the best average values for PD-related DGM nuclei (DSC: 0.854, IOU: 0.750, HD95: 1.691 mm, ASD: 0.195 mm). Experiments conducted on multi-center clinical datasets and public datasets demonstrate the good generalizability of the proposed method. Furthermore, a volumetric analysis of segmentation results reveals significant differences between healthy controls (HCs) and PD patients. Our method holds promise for assisting clinicians in the rapid and accurate diagnosis of PD, offering a practical method for the imaging analysis of neurodegenerative diseases.
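
The overlap metrics quoted above (DSC and IoU) can be illustrated with a minimal sketch. It assumes binary masks flattened into equal-length 0/1 sequences; the function name `dice_and_iou` and the toy masks are illustrative only and are not taken from the paper.

```python
def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_size = sum(1 for p in pred if p)
    target_size = sum(1 for t in target if t)
    union = pred_size + target_size - intersection
    if union == 0:
        # Convention assumed here: two empty masks count as perfect agreement.
        return 1.0, 1.0
    dsc = 2.0 * intersection / (pred_size + target_size)
    iou = intersection / union
    return dsc, iou


# Toy example: 3 predicted voxels, 3 reference voxels, 2 of them shared.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dsc, iou = dice_and_iou(pred, target)
print(f"DSC={dsc:.3f} IoU={iou:.3f}")
```

For a single pair of masks the two scores are tied by IoU = DSC / (2 - DSC); averages taken over many cases need not satisfy that relation exactly.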

Volume 118, Article 102465

Citations: 0

Retinal structure guidance-and-adaption network for early Parkinson’s disease recognition based on OCT images

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-11-19 DOI: 10.1016/j.compmedimag.2024.102463

Hanfeng Shi , Jiaqi Wei , Richu Jin , Jiaxin Peng , Xingyue Wang , Yan Hu , Xiaoqing Zhang , Jiang Liu

Parkinson’s disease (PD) is a leading neurodegenerative disease globally. Precise and objective PD diagnosis is significant for early intervention and treatment. Recent studies have shown significant correlations between retinal structure information and PD based on optical coherence tomography (OCT) images, providing another potential means for early PD recognition. However, how to exploit the retinal structure information (e.g., thickness and mean intensity) from different retinal layers to improve PD recognition performance has not been studied before. Motivated by the above observations, we first propose a structural prior knowledge extraction (SPKE) module to obtain the retinal structure feature maps; then, we develop a structure-guided-and-adaption attention (SGDA) module to fully leverage the potential of different retinal layers based on the extracted retinal structure feature maps. By embedding SPKE and SGDA modules at the low stage of deep neural networks (DNNs), a retinal structure-guided-and-adaption network (RSGA-Net) is constructed for early PD recognition based on OCT images. The extensive experiments on a clinical OCT-PD dataset demonstrate the superiority of RSGA-Net over state-of-the-art methods. Additionally, we provide a visual analysis to explain how retinal structure information affects the decision-making process of DNNs.
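
The abstract names layer thickness and mean intensity as examples of retinal structure information but does not describe how the SPKE module derives them. As a rough illustration of those two quantities only, the sketch below assumes a 2D OCT B-scan and a same-shaped integer label map in which each pixel carries a layer id; `layer_features` and the toy arrays are hypothetical.

```python
def layer_features(image, labels, layer_id):
    """Mean thickness (pixels per column) and mean intensity of one layer.

    image, labels: 2D lists indexed [row][col] with identical shapes;
    labels holds an integer layer id for every pixel.
    """
    n_cols = len(labels[0])
    pixel_count = 0
    intensity_sum = 0.0
    for image_row, label_row in zip(image, labels):
        for value, label in zip(image_row, label_row):
            if label == layer_id:
                pixel_count += 1
                intensity_sum += value
    mean_thickness = pixel_count / n_cols
    mean_intensity = intensity_sum / pixel_count if pixel_count else 0.0
    return mean_thickness, mean_intensity


# Toy 4x3 B-scan with two layers (ids 1 and 2) above background (id 0).
image = [[10, 20, 30], [40, 50, 60], [70, 80, 90], [0, 0, 0]]
labels = [[1, 1, 1], [1, 2, 2], [2, 2, 2], [0, 0, 0]]
thickness_1, intensity_1 = layer_features(image, labels, 1)
thickness_2, intensity_2 = layer_features(image, labels, 2)
```

Converting the pixel thickness to micrometres would require the scanner's axial resolution, which is not given in the abstract.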

Volume 118, Article 102463

Citations: 0

Exploratory analysis of Type B Aortic Dissection (TBAD) segmentation in 2D CTA images using various kernels

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-11-18 DOI: 10.1016/j.compmedimag.2024.102460

Ayman Abaid , Srinivas Ilancheran , Talha Iqbal , Niamh Hynes , Ihsan Ullah

Type-B Aortic Dissection is a rare but fatal cardiovascular disease characterized by a tear in the inner layer of the aorta, affecting 3.5 per 100,000 individuals annually. In this work, we explore the feasibility of leveraging two-dimensional Convolutional Neural Network (CNN) models to perform accurate slice-by-slice segmentation of true lumen, false lumen and false lumen thrombus in Computed Tomography Angiography images. The study performed an exploratory analysis of three 2D U-Net models: the baseline 2D U-Net, a variant of U-Net with atrous convolutions, and a U-Net with a custom layer featuring a position-oriented, partially shared weighting scheme kernel. These models were trained and benchmarked against a state-of-the-art baseline 3D U-Net model. Overall, our U-Net with the VGG19 encoder architecture achieved the best performance score among all other models, with a mean Dice score of 80.48% and an IoU score of 72.93%. The segmentation results were also compared with the Segment Anything Model (SAM) and the UniverSeg models. Our findings indicate that our 2D U-Net models excel in false lumen and true lumen segmentation accuracy while achieving lower false lumen thrombus segmentation accuracy compared to the state-of-the-art 3D U-Net model. The study findings highlight the complexities involved in developing segmentation models, especially for cardiovascular medical images, and emphasize the importance of developing lightweight models for real-time decision-making to improve overall patient care.
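
One of the kernels compared above is the atrous (dilated) convolution. The sketch below shows the idea in one dimension, assuming a "valid" cross-correlation with no padding; `atrous_conv1d` is an illustrative helper and not the authors' layer. A kernel with k taps and dilation d covers (k - 1) * d + 1 input samples without adding weights.

```python
def atrous_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation with taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation + 1  # input samples covered per output
    outputs = []
    for start in range(len(signal) - span + 1):
        outputs.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        )
    return outputs


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # simple difference filter

dense = atrous_conv1d(signal, kernel, dilation=1)   # each output spans 3 samples
atrous = atrous_conv1d(signal, kernel, dilation=2)  # each output spans 5 samples
print(dense, atrous)
```

With the same three weights, the dilated version compares samples four positions apart instead of two, which is how atrous kernels enlarge the receptive field at constant parameter count.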

Volume 118, Article 102460

Citations: 0

Exploring transformer reliability in clinically significant prostate cancer segmentation: A comprehensive in-depth investigation

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-11-17 DOI: 10.1016/j.compmedimag.2024.102459

Gustavo Andrade-Miranda , Pedro Soto Vega , Kamilia Taguelmimt , Hong-Phuong Dang , Dimitris Visvikis , Julien Bert

Despite the growing prominence of transformers in medical image segmentation, their application to clinically significant prostate cancer (csPCa) has been overlooked. Minimal attention has been paid to domain shift analysis and uncertainty assessment, critical for safely implementing computer-aided diagnosis (CAD) systems. Domain shift in medical imagery refers to differences between the data used to train a model and the data evaluated later, arising from variations in imaging equipment, protocols, patient populations, and acquisition noise. While recent models enhance in-domain performance, areas such as robustness and uncertainty estimation in out-of-domain distributions have received limited investigation, creating indecisiveness about model reliability. In contrast, our study addresses csPCa at voxel, lesion, and image levels, investigating models from traditional U-Net to cutting-edge transformers. We focus on four key points: robustness, calibration, out-of-distribution (OOD), and misclassification detection (MD). Findings show that transformer-based models exhibit enhanced robustness at image and lesion levels, both in and out of domain. However, this improvement is not fully translated to the voxel level, where Convolutional Neural Networks (CNNs) outperform in most robustness metrics. Regarding uncertainty, hybrid transformers and transformer encoders performed better, but this trend depends on misclassification or out-of-distribution tasks.
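
The abstract lists calibration among its four evaluation points but does not say which metric was used. One common choice, assumed here purely for illustration, is the expected calibration error (ECE): predictions are grouped into confidence bins and the gap between mean confidence and accuracy is averaged, weighted by bin size. The function name and the equal-width binning convention are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins [k/n_bins, (k+1)/n_bins); 1.0 goes in the last bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Toy example: four confident predictions (three right), two unsure (one right).
confidences = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
correct = [1, 1, 1, 0, 1, 0]
ece = expected_calibration_error(confidences, correct)
```

A perfectly calibrated model has an ECE of zero; comparing the value on in-domain and out-of-domain data is one way to quantify how calibration degrades under domain shift.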

Volume 118, Article 102459

Citations: 0

NACNet: A histology context-aware transformer graph convolution network for predicting treatment response to neoadjuvant chemotherapy in Triple Negative Breast Cancer

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-11-17 DOI: 10.1016/j.compmedimag.2024.102467

Qiang Li , George Teodoro , Yi Jiang , Jun Kong

Neoadjuvant chemotherapy (NAC) response prediction for triple negative breast cancer (TNBC) patients is a challenging task clinically as it requires understanding complex histology interactions within the tumor microenvironment (TME). Digital whole slide images (WSIs) capture detailed tissue information, but their giga-pixel size necessitates computational methods based on multiple instance learning, which typically analyze small, isolated image tiles without the spatial context of the TME. To address this limitation and incorporate TME spatial histology interactions in predicting NAC response for TNBC patients, we developed a histology context-aware transformer graph convolution network (NACNet). Our deep learning method identifies the histopathological labels on individual image tiles from WSIs, constructs a spatial TME graph, and represents each node with features derived from tissue texture and social network analysis. It predicts NAC response using a transformer graph convolution network model enhanced with graph isomorphism network layers. We evaluated our method on WSIs from a cohort of TNBC patients (N=105) and compared its performance with multiple state-of-the-art machine learning and deep learning models, including both graph and non-graph approaches. Our NACNet achieves 90.0% accuracy, 96.0% sensitivity, 88.0% specificity, and an AUC of 0.82, through eight-fold cross-validation, outperforming baseline models. These comprehensive experimental results suggest that NACNet holds strong potential for stratifying TNBC patients by NAC response, thereby helping to prevent overtreatment, improve patient quality of life, reduce treatment cost, and enhance clinical outcomes, marking an important advancement toward personalized breast cancer treatment.
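
The accuracy, sensitivity, and specificity reported above are all derived from the same confusion counts. The sketch below shows the standard definitions for a binary task, assuming label 1 marks the positive class and that both classes occur in the data; `binary_metrics` and the toy labels are illustrative and unrelated to the study's cohort.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from paired 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy example: 4 positives (3 detected), 6 negatives (4 correctly rejected).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

Because accuracy is a prevalence-weighted mix of sensitivity and specificity, it always lies between the two, which is a quick sanity check on any reported triple.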

Volume 118, Article 102467

Citations: 0

Self-supervised multi-modal feature fusion for predicting early recurrence of hepatocellular carcinoma

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-11-14 DOI: 10.1016/j.compmedimag.2024.102457

Sen Wang , Ying Zhao , Jiayi Li , Zongmin Yi , Jun Li , Can Zuo , Yu Yao , Ailian Liu

Surgical resection stands as the primary treatment option for early-stage hepatocellular carcinoma (HCC) patients. Postoperative early recurrence (ER) is a significant factor contributing to the mortality of HCC patients. Therefore, accurately predicting the risk of ER after curative resection is crucial for clinical decision-making and improving patient prognosis. This study leverages a self-supervised multi-modal feature fusion approach, combining multi-phase MRI and clinical features, to predict ER of HCC. Specifically, we utilized attention mechanisms to suppress redundant features, enabling efficient extraction and fusion of multi-phase features. Through self-supervised learning (SSL), we pretrained an encoder on our dataset to extract more generalizable feature representations. Finally, we achieved effective multi-modal information fusion via attention modules. To enhance explainability, we employed Score-CAM to visualize the key regions influencing the model’s predictions. We evaluated the effectiveness of the proposed method on our dataset and found that predictions based on multi-phase feature fusion outperformed those based on single-phase features. Additionally, predictions based on multi-modal feature fusion were superior to those based on single-modal features.

Volume 118, Article 102457

Citations: 0

DSIFNet: Implicit feature network for nasal cavity and vestibule segmentation from 3D head CT

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-11-12 DOI: 10.1016/j.compmedimag.2024.102462

Yi Lu , Hongjian Gao , Jikuan Qiu , Zihan Qiu , Junxiu Liu , Xiangzhi Bai

This study aims to accurately segment the nasal cavity and its intricate internal anatomy from head CT images, which is critical for understanding nasal physiology, diagnosing diseases, and planning surgeries. The nasal cavity and its anatomical structures, such as the sinuses and vestibule, exhibit significant scale differences, with complex shapes and variable microstructures. These features require the segmentation method to have strong cross-scale feature extraction capabilities. To effectively address this challenge, we propose an image segmentation network named the Deeply Supervised Implicit Feature Network (DSIFNet). This network uniquely incorporates an Implicit Feature Function Module Guided by Local and Global Positional Information (LGPI-IFF), enabling effective fusion of features across scales and enhancing the network's ability to recognize details and overall structures. Additionally, we introduce a deep supervision mechanism based on implicit feature functions in the network's decoding phase, optimizing the utilization of multi-scale feature information, thus improving segmentation precision and detail representation. Furthermore, we constructed a dataset comprising 7116 CT volumes (including 1,292,508 slices) and implemented PixPro-based self-supervised pretraining to utilize unlabeled data for enhanced feature extraction. Our tests on nasal cavity and vestibule segmentation, conducted on a dataset comprising 128 head CT volumes (including 34,006 slices), demonstrate the robustness and superior performance of the proposed method, achieving leading results across multiple segmentation metrics.

Citations: 0

AFSegNet: few-shot 3D ankle-foot bone segmentation via hierarchical feature distillation and multi-scale attention and fusion

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-11-01 DOI: 10.1016/j.compmedimag.2024.102456

Yuan Huang , Sven A. Holcombe , Stewart C. Wang , Jisi Tang

Accurate segmentation of ankle and foot bones from CT scans is essential for morphological analysis. Ankle and foot bone segmentation is challenging due to blurred bone boundaries, narrow inter-bone gaps, gaps in the cortical shell, and uneven spongy bone textures. Our study endeavors to create a deep learning framework that harnesses the advantages of 3D deep learning and tackles the hurdles in accurately segmenting ankle and foot bones from clinical CT scans. With computational cost in mind, we propose a few-shot framework, AFSegNet, comprising three 3D deep-learning networks that progress from simple to complex tasks and network structures. Specifically, a shallow network first over-segments the foreground; its output, along with the foreground ground truth, is used to supervise a subsequent network that detects the over-segmented regions, which are overwhelmingly inter-bone gaps. The foreground and inter-bone gap probability maps are then input into a network with multi-scale attention and feature fusion, trained with a loss function combining region-, boundary-, and topology-based terms, to obtain the fine-level bone segmentation. AFSegNet is applied to a 16-class segmentation task using 123 in-house CT scans and requires only a GPU with 24 GB of memory, since the three sub-networks can be trained successively and individually. AFSegNet achieves a Dice of 0.953 and an average surface distance of 0.207. The ablation study and comparison with two basic state-of-the-art networks indicate the effectiveness of the progressively distilled features, attention and feature fusion modules, and hybrid loss functions, with the mean surface distance error decreased by up to 50%.
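
For readers unfamiliar with the overlap metric reported in this abstract, the Dice score for a single class can be sketched as below. This is a minimal illustration on flattened binary masks, not the authors' evaluation code; the function name `dice_coefficient`, the toy masks, and the convention for two empty masks are all assumptions made for the example.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Foreground voxel counts of each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks: 3 predicted voxels, 3 true voxels, 2 shared.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 1, 0]
print(dice_coefficient(pred, truth))
```

For a multi-class task such as the 16-class setting described here, one common approach is to compute this score per class and average the results; whether the reported 0.953 was aggregated that way is not stated in the abstract.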

Citations: 0

VLFATRollout: Fully transformer-based classifier for retinal OCT volumes

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-10-29 DOI: 10.1016/j.compmedimag.2024.102452

Marzieh Oghbaie , Teresa Araújo , Ursula Schmidt-Erfurth , Hrvoje Bogunović

Background and Objective:

Despite the promising capabilities of 3D transformer architectures in video analysis, their application to high-resolution 3D medical volumes encounters several challenges. One major limitation is the high number of 3D patches, which reduces the efficiency of the global self-attention mechanisms of transformers. Additionally, background information can distract vision transformers from focusing on crucial areas of the input image, thereby introducing noise into the final representation. Moreover, the variability in the number of slices per volume complicates the development of models capable of processing input volumes of any resolution, while simple solutions like subsampling risk losing essential diagnostic details.

Methods:

To address these challenges, we introduce an end-to-end transformer-based framework, variable length feature aggregator transformer rollout (VLFATRollout), to classify volumetric data. The proposed VLFATRollout has several merits. First, it can effectively mine slice-level foreground-background information with the help of the transformer’s attention matrices. Second, randomizing the volume-wise resolution (i.e., the number of slices) during training enhances the learning capacity of the learnable positional embedding (PE) assigned to each volume slice. This technique allows the PEs to generalize across neighboring slices, facilitating the handling of high-resolution volumes at test time.
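
The idea of randomizing the number of slices per training volume can be sketched as follows. The abstract only states that the slice count is randomized; the evenly spaced selection, the function name `sample_slice_indices`, and the `min_keep`/`max_keep` parameters are assumptions made for this illustration and are not taken from the paper or its repository.

```python
import random


def sample_slice_indices(num_slices, min_keep, max_keep, rng=random):
    """Draw a random slice count, then pick that many evenly spaced slice indices."""
    if num_slices < 1 or min_keep < 1 or max_keep < min_keep:
        raise ValueError("need num_slices >= 1 and 1 <= min_keep <= max_keep")
    upper = min(max_keep, num_slices)   # cannot keep more slices than exist
    lower = min(min_keep, upper)
    keep = rng.randint(lower, upper)    # inclusive on both ends
    if keep == 1:
        return [num_slices // 2]        # a single slice: take the middle one
    # Integer arithmetic keeps the indices strictly increasing and in range.
    return [(i * (num_slices - 1)) // (keep - 1) for i in range(keep)]


# Hypothetical volume with 50 slices; each call may return a different length.
rng = random.Random(0)
indices = sample_slice_indices(50, min_keep=8, max_keep=32, rng=rng)
print(len(indices), indices[0], indices[-1])
```

Because each training step sees a different number of slices, a given positional embedding is paired with slices from a range of neighboring positions, which is one plausible reading of how the PEs come to generalize across neighboring slices.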

Results:

VLFATRollout was thoroughly tested on the retinal optical coherence tomography (OCT) volume classification task, demonstrating a notable average improvement of 5.47% in balanced accuracy over the leading convolutional models for a 5-class diagnostic task. These results emphasize the effectiveness of our framework in enhancing slice-level representation and its adaptability across different volume resolutions, paving the way for advanced transformer applications in medical image analysis. The code is available at https://github.com/marziehoghbaie/VLFATRollout/.
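
Balanced accuracy, the metric quoted above, is commonly defined as the mean of the per-class recalls, which keeps a majority class from dominating the score. The sketch below illustrates that standard definition on made-up labels; it is not the authors' evaluation code, and the function name and toy data are assumptions.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes that appear in y_true."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    support = Counter(y_true)                                        # samples per true class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)   # hits per true class
    recalls = [correct[c] / n for c, n in support.items()]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced case: a model that always predicts class 0.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
print(balanced_accuracy(y_true, y_pred))
```

On this toy data plain accuracy is 0.8, while balanced accuracy is 0.5, because the recall on class 1 is zero. That gap is why the metric is often preferred for multi-class diagnostic tasks with uneven class sizes.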

Citations: 0

WISE: Efficient WSI selection for active learning in histopathology

IF 5.4 Zone 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics

Pub Date : 2024-10-28 DOI: 10.1016/j.compmedimag.2024.102455

Hyeongu Kang , Mujin Kim , Young Sin Ko , Yesung Cho , Mun Yong Yi

Deep neural network (DNN) models have been applied to a wide variety of medical image analysis tasks, often achieving performance that matches that of medical doctors. However, given that even minor errors in a model can impact patients’ lives, it is critical that these models are continuously improved. Hence, active learning (AL) has garnered attention as an effective and sustainable strategy for enhancing DNN models in the medical domain. Extant AL research in histopathology has primarily focused on patch datasets derived from whole-slide images (WSIs), a standard form of cancer diagnostic image obtained from a high-resolution scanner. However, this approach fails to address the selection of WSIs, which can impede the performance improvement of deep learning models and increase the number of WSIs needed to achieve the target performance. This study introduces a WSI-level AL method, termed WSI-informative selection (WISE). WISE is designed to select informative WSIs using a newly formulated WSI-level class distance metric. This method aims to identify diverse and uncertain WSIs, thereby contributing to model performance enhancement. WISE demonstrates state-of-the-art performance on the real-world Colon and Stomach datasets as well as the public DigestPath dataset, reducing the required number of WSIs by more than threefold compared to the one-pool dataset setting that has predominantly been used in the field.
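
To make slide-level selection concrete, the sketch below ranks slides by a generic uncertainty score. It deliberately does not implement the WSI-level class distance metric, which the abstract does not define, and it ignores the diversity criterion. It swaps in plain predictive entropy of the averaged patch predictions, and the function names, slide ids, and probabilities are all hypothetical.

```python
import math


def wsi_entropy(patch_probs):
    """Entropy of the mean class distribution over one slide's patch predictions."""
    n_classes = len(patch_probs[0])
    mean = [sum(p[c] for p in patch_probs) / len(patch_probs) for c in range(n_classes)]
    return -sum(q * math.log(q) for q in mean if q > 0)


def select_uncertain_wsis(slides, budget):
    """Return the ids of the `budget` slides with the highest entropy."""
    ranked = sorted(slides, key=lambda sid: wsi_entropy(slides[sid]), reverse=True)
    return ranked[:budget]


# Hypothetical per-patch class probabilities for three unlabeled slides.
slides = {
    "wsi_a": [[0.95, 0.05], [0.90, 0.10]],
    "wsi_b": [[0.55, 0.45], [0.50, 0.50]],
    "wsi_c": [[0.80, 0.20], [0.70, 0.30]],
}
print(select_uncertain_wsis(slides, budget=2))  # ['wsi_b', 'wsi_c']
```

Selecting whole slides rather than individual patches means the annotation budget is counted in WSIs, which is the quantity the abstract reports as being reduced.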

Citations: 0