CRAD: Cognitive Aware Feature Refinement with Missing Modalities for Early Alzheimer’s Progression Prediction
Pub Date: 2025-12-01 | Epub Date: 2025-11-10 | DOI: 10.1016/j.compmedimag.2025.102664
Fei Liu , Shiuan-Ni Liang , Mohamed Hisham Jaward , Huey Fang Ong , Huabin Wang , Alzheimer’s Disease Neuroimaging Initiative , Australian Imaging Biomarkers and Lifestyle flagship study of ageing
Accurate diagnosis and early prediction of Alzheimer’s disease (AD) often require multiple neuroimaging modalities, but in many cases only one or two are available. This missing-modality problem degrades diagnostic accuracy and is a critical challenge in clinical practice. Multimodal knowledge distillation (KD) offers a promising solution by aligning complete knowledge from multimodal data with that of partial modalities. However, current methods focus on aligning high-level features, which limits their effectiveness due to insufficient transfer of reliable knowledge. In this work, we propose a novel Consistency Refinement-driven Multi-level Self-Attention Distillation framework (CRAD) for early Alzheimer’s progression prediction, which enables the cross-modal transfer of more robust shallow knowledge and uses self-attention to refine features. We develop a multi-level distillation module to progressively distill cross-modal discriminating knowledge, enabling lightweight yet reliable knowledge transfer. Moreover, we design a novel self-attention distillation module (PF-CMAD) to transfer disease-relevant intermediate knowledge; it leverages feature self-similarity to capture cross-modal correlations without introducing trainable parameters, enabling interpretable and efficient distillation. We also incorporate a consistency-evaluation-driven confidence regularization strategy within the distillation process, which dynamically refines knowledge using adaptive distillation controllers that assess teacher confidence. Comprehensive experiments demonstrate that our method achieves superior accuracy and robust cross-dataset generalization using only MRI for AD diagnosis and early progression prediction. The code is available at https://github.com/LiuFei-AHU/CRAD.
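A minimal sketch of the parameter-free idea behind PF-CMAD: distillation that matches teacher and student feature self-similarity maps. This is an illustration, not the authors' implementation (see the linked repository for that); the tensor shapes and the KL formulation are assumptions.

```python
import torch
import torch.nn.functional as F

def self_similarity(feat: torch.Tensor) -> torch.Tensor:
    """Map features (B, C, H, W) to a (B, HW, HW) attention map computed
    from feature self-similarity -- no trainable parameters involved."""
    b, c, h, w = feat.shape
    x = feat.flatten(2).transpose(1, 2)        # (B, HW, C)
    x = F.normalize(x, dim=-1)                 # cosine similarity
    return torch.softmax(x @ x.transpose(1, 2) / c ** 0.5, dim=-1)

def attention_distillation_loss(student_feat, teacher_feat):
    """Align the student's self-similarity map with the teacher's."""
    with torch.no_grad():
        t_attn = self_similarity(teacher_feat)
    s_attn = self_similarity(student_feat)
    return F.kl_div(s_attn.clamp_min(1e-8).log(), t_attn, reduction="batchmean")

# toy usage: an MRI-only student mimics a multimodal teacher at one level
s = torch.randn(2, 64, 16, 16)   # student (MRI-only) features
t = torch.randn(2, 64, 16, 16)   # teacher (multimodal) features
print(attention_distillation_loss(s, t).item())
```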
{"title":"CRAD: Cognitive Aware Feature Refinement with Missing Modalities for Early Alzheimer’s Progression Prediction","authors":"Fei Liu , Shiuan-Ni Liang , Mohamed Hisham Jaward , Huey Fang Ong , Huabin Wang , Alzheimer’s Disease Neuroimaging Initiative , Australian Imaging Biomarkers and Lifestyle flagship study of ageing","doi":"10.1016/j.compmedimag.2025.102664","DOIUrl":"10.1016/j.compmedimag.2025.102664","url":null,"abstract":"<div><div>Accurate diagnosis and early prediction of Alzheimer’s disease (AD) often require multiple neuroimageing modalities, but in many cases, only one or two modalities are available. This missing modality hinders the accuracy of diagnosis and is a critical challenge in clinical practice. Multimodal knowledge distillation (KD) offers a promising solution by aligning complete knowledge from multimodal data with that of partial modalities. However, current methods focus on aligning high-level features, which limit their effectiveness due to insufficient transfer of reliable knowledge. In this work, we propose a novel Consistency Refinement-driven Multi-level Self-Attention Distillation framework (CRAD) for Early Alzheimer’s Progression Prediction, which enables the cross-modal transfer of more robust shallow knowledge with self-attention to refine features. We develop a multi-level distillation module to progressively distill cross-modal discriminating knowledge, enabling lightweight yet reliable knowledge transfer. Moreover, we design a novel self-attention distillation module (PF-CMAD) to transfer disease-relevant intermediate knowledge, which leverages feature self-similarity to capture cross-modal correlations without introducing trainable parameters, enabling interpretable and efficient distillation. We incorporate a consistency-evaluation-driven confidence regularization strategy within the distillation process. This strategy dynamically refines knowledge using adaptive distillation controllers that assess teacher confidence. Comprehensive experiments demonstrate that our method achieves superior accuracy and robust cross-dataset generalization performance using only MRI for AD diagnosis and early progression prediction. The code is available at <span><span>https://github.com/LiuFei-AHU/CRAD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102664"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145507841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A CNN-Transformer fusion network for Diabetic retinopathy image classification
Pub Date: 2025-12-01 | Epub Date: 2025-10-21 | DOI: 10.1016/j.compmedimag.2025.102655
Xuan Huang , Zhuang Ai , Chongyang She , Qi Li , Qihao Wei , Sha Xu , Yaping Lu , Fanxin Zeng
Diabetic retinopathy (DR) is a leading cause of blindness worldwide, yet current diagnosis relies on labor-intensive and subjective fundus image interpretation. Here we present a convolutional neural network-transformer fusion model (DR-CTFN) that integrates ConvNeXt and Swin Transformer architectures with a lightweight attention block (LAB) to enhance feature extraction. To address dataset imbalance, we applied standardized preprocessing and extensive image augmentation. On the Kaggle EyePACS dataset, DR-CTFN outperformed ConvNeXt and Swin Transformer in accuracy by 3.14% and 8.39%, respectively, and in area under the curve (AUC) by 1% and 26.08%. External validation on APTOS 2019 Blindness Detection and a clinical DR dataset yielded accuracies of 84.45% and 85.31%, with AUC values of 95.22% and 95.79%, respectively. These results demonstrate that DR-CTFN enables rapid, robust, and precise DR detection, offering a scalable approach for early diagnosis and prevention of vision loss, thereby improving quality of life for DR patients.
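A hypothetical sketch of how a lightweight attention block might fuse CNN and Transformer feature maps; the actual LAB and DR-CTFN wiring are described in the paper, and the class name and dimensions below are illustrative.

```python
import torch
import torch.nn as nn

class LightweightAttentionFusion(nn.Module):
    """Fuse CNN (ConvNeXt-style) and Transformer (Swin-style) feature maps
    with channel attention -- a stand-in for the paper's LAB, not a copy."""
    def __init__(self, cnn_dim, trans_dim, fused_dim):
        super().__init__()
        self.proj_cnn = nn.Conv2d(cnn_dim, fused_dim, kernel_size=1)
        self.proj_trans = nn.Conv2d(trans_dim, fused_dim, kernel_size=1)
        self.attn = nn.Sequential(                 # squeeze-excite style gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * fused_dim, fused_dim // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(fused_dim // 4, 2 * fused_dim, 1), nn.Sigmoid())
        self.head = nn.Conv2d(2 * fused_dim, fused_dim, kernel_size=1)

    def forward(self, f_cnn, f_trans):
        x = torch.cat([self.proj_cnn(f_cnn), self.proj_trans(f_trans)], dim=1)
        return self.head(x * self.attn(x))         # recalibrate, then mix

f_cnn = torch.randn(2, 768, 7, 7)     # e.g. ConvNeXt stage-4 features
f_trans = torch.randn(2, 1024, 7, 7)  # e.g. Swin stage-4 features
fused = LightweightAttentionFusion(768, 1024, 256)(f_cnn, f_trans)
print(fused.shape)                    # torch.Size([2, 256, 7, 7])
```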
{"title":"A CNN-Transformer fusion network for Diabetic retinopathy image classification","authors":"Xuan Huang , Zhuang Ai , Chongyang She , Qi Li , Qihao Wei , Sha Xu , Yaping Lu , Fanxin Zeng","doi":"10.1016/j.compmedimag.2025.102655","DOIUrl":"10.1016/j.compmedimag.2025.102655","url":null,"abstract":"<div><div>Diabetic retinopathy (DR) is a leading cause of blindness worldwide, yet current diagnosis relies on labor-intensive and subjective fundus image interpretation. Here we present a convolutional neural network-transformer fusion model (DR-CTFN) that integrates ConvNeXt and Swin Transformer algorithms with a lightweight attention block (LAB) to enhance feature extraction. To address dataset imbalance, we applied standardized preprocessing and extensive image augmentation. On the Kaggle EyePACS dataset, DR-CTFN outperformed ConvNeXt and Swin Transformer in accuracy by 3.14% and 8.39%, while also achieving a superior area under the curve (AUC) by 1% and 26.08%. External validation on APTOS 2019 Blindness Detection and a clinical DR dataset yielded accuracies of 84.45% and 85.31%, with AUC values of 95.22% and 95.79%, respectively. These results demonstrate that DR-CTFN enables rapid, robust, and precise DR detection, offering a scalable approach for early diagnosis and prevention of vision loss, thereby enhancing the quality of life for DR patients.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102655"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145394925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ESAM2-BLS: Enhanced segment anything model 2 for efficient breast lesion segmentation in ultrasound imaging
Pub Date: 2025-12-01 | Epub Date: 2025-10-10 | DOI: 10.1016/j.compmedimag.2025.102654
Lishuang Guo , Haonan Zhang , Chenbin Ma
Ultrasound imaging is an economical, efficient, and non-invasive diagnostic tool widely used for breast lesion screening and diagnosis. However, segmentation of lesion regions remains a significant challenge due to factors such as noise interference and variability in image quality. To address this issue, we propose enhanced segment anything model 2 for breast lesion segmentation (ESAM2-BLS), a deep learning model built on an optimized version of the SAM2 architecture. ESAM2-BLS customizes and fine-tunes the pre-trained SAM2 model by introducing an adapter module specifically designed for the characteristics of breast ultrasound images. The adapter module directly addresses ultrasound-specific challenges, including speckle noise, low-contrast boundaries, shadowing artifacts, and anisotropic resolution, through targeted architectural elements such as channel attention mechanisms, specialized convolution kernels, and optimized skip connections. This significantly improves segmentation accuracy, particularly for low-contrast and small lesion regions. Compared to traditional methods, ESAM2-BLS fully leverages the generalization capabilities of large models while incorporating multi-scale feature fusion and axial dilated depthwise convolution to effectively capture multi-level information from complex lesions. During decoding, the model enhances the identification of fine boundaries and small lesions through depthwise separable convolutions and skip connections, while maintaining a low computational cost. In five-fold cross-validation on two datasets comprising over 1600 patients, ESAM2-BLS achieves average Dice scores of 0.9077 and 0.8633; visualization of the segmentation results and interpretability analysis further confirm its accuracy and robustness. The model provides an efficient, reliable, and specialized automated solution for early breast cancer screening and diagnosis.
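A minimal sketch of a bottleneck adapter with channel attention, the general mechanism the abstract describes for adapting a frozen SAM2 encoder to ultrasound. Module names, token shapes, and the gating design are assumptions, not the exact ESAM2-BLS adapter.

```python
import torch
import torch.nn as nn

class UltrasoundAdapter(nn.Module):
    """Bottleneck adapter intended to sit inside a frozen SAM2 encoder
    block -- an illustrative sketch only."""
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, dim // reduction)
        self.act = nn.GELU()
        self.up = nn.Linear(dim // reduction, dim)
        # channel attention to damp speckle-noise-dominated channels
        self.gate = nn.Sequential(nn.Linear(dim, dim // reduction), nn.ReLU(),
                                  nn.Linear(dim // reduction, dim), nn.Sigmoid())

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:  # (B, N, C)
        residual = tokens
        x = self.up(self.act(self.down(tokens)))
        x = x * self.gate(tokens.mean(dim=1, keepdim=True))  # channel recalibration
        return residual + x                                  # only adapter params train

tokens = torch.randn(1, 4096, 256)   # hypothetical SAM2 image tokens
print(UltrasoundAdapter(256)(tokens).shape)
```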
{"title":"ESAM2-BLS: Enhanced segment anything model 2 for efficient breast lesion segmentation in ultrasound imaging","authors":"Lishuang Guo , Haonan Zhang , Chenbin Ma","doi":"10.1016/j.compmedimag.2025.102654","DOIUrl":"10.1016/j.compmedimag.2025.102654","url":null,"abstract":"<div><div>Ultrasound imaging, as an economical, efficient, and non-invasive diagnostic tool, is widely used for breast lesion screening and diagnosis. However, the segmentation of lesion regions remains a significant challenge due to factors such as noise interference and the variability in image quality. To address this issue, we propose a novel deep learning model named enhanced segment anything model 2 (SAM2) for breast lesion segmentation (ESAM2-BLS). This model is an optimized version of the SAM2 architecture. ESAM2-BLS customizes and fine-tunes the pre-trained SAM2 model by introducing an adapter module, specifically designed to accommodate the unique characteristics of breast ultrasound images. The adapter module directly addresses ultrasound-specific challenges including speckle noise, low contrast boundaries, shadowing artifacts, and anisotropic resolution through targeted architectural elements such as channel attention mechanisms, specialized convolution kernels, and optimized skip connections. This optimization significantly improves segmentation accuracy, particularly for low-contrast and small lesion regions. Compared to traditional methods, ESAM2-BLS fully leverages the generalization capabilities of large models while incorporating multi-scale feature fusion and axial dilated depthwise convolution to effectively capture multi-level information from complex lesions. During the decoding process, the model enhances the identification of fine boundaries and small lesions through depthwise separable convolutions and skip connections, while maintaining a low computational cost. Visualization of the segmentation results and interpretability analysis demonstrate that ESAM2-BLS achieves an average Dice score of 0.9077 and 0.8633 in five-fold cross-validation across two datasets with over 1600 patients. These results significantly improve segmentation accuracy and robustness. This model provides an efficient, reliable, and specialized automated solution for early breast cancer screening and diagnosis.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102654"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145356747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multistain multicompartment automatic segmentation in renal biopsies with thrombotic microangiopathies and other vasculopathies
Pub Date: 2025-12-01 | Epub Date: 2025-10-22 | DOI: 10.1016/j.compmedimag.2025.102658
Nicola Altini , Michela Prunella , Surya V. Seshan , Savino Sciascia , Antonella Barreca , Alessandro Del Gobbo , Stefan Porubsky , Hien Van Nguyen , Claudia Delprete , Berardino Prencipe , Deján Dobi , Daan P.C. van Doorn , Sjoerd A.M.E.G. Timmermans , Pieter van Paassen , Vitoantonio Bevilacqua , Jan Ulrich Becker
Automatic tissue segmentation is a necessary step for the bulk analysis of whole slide images (WSIs) from paraffin histology sections in kidney biopsies. However, existing models often fail to generalize across the main nephropathological staining methods and to capture the severe morphological distortions in arteries, arterioles, and glomeruli common in thrombotic microangiopathy (TMA) and other vasculopathies. Therefore, we developed an automatic multi-staining segmentation pipeline covering six key compartments: Artery, Arteriole, Glomerulus, Cortex, Medulla, and Capsule/Other. This framework enables downstream tasks such as counting and labeling at the instance, WSI, or biopsy level. Biopsies (n = 158) from seven centers (Cologne, Turin, Milan, Weill-Cornell, Mainz, Maastricht, and Budapest) were classified by expert nephropathologists into TMA (n = 87) or Mimickers (n = 71). Ground-truth expert segmentation masks were provided for all compartments, along with expert binary TMA classification labels for Glomerulus, Artery, and Arteriole. The biopsies were divided into training (n = 79), validation (n = 26), and test (n = 53) subsets. We benchmarked six deep learning models for semantic segmentation (U-Net, FPN, DeepLabV3+, Mask2Former, SegFormer, SegNeXt) and five for classification (ResNet-34, DenseNet-121, EfficientNet-v2-S, ConvNeXt-Small, Swin-v2-B), obtaining robust segmentation results across all compartments. On the test set, the best models achieved Dice coefficients of 0.903 (Cortex), 0.834 (Medulla), 0.816 (Capsule/Other), 0.922 (Glomerulus), 0.822 (Artery), and 0.553 (Arteriole). The best classification models achieved accuracies of 0.724 and 0.841 for the Glomerulus and the combined Artery plus Arteriole compartments, respectively. Furthermore, we release NePathTK (NephroPathology Toolkit), an open-source end-to-end pipeline integrated with QuPath, enabling accurate segmentation for decision support in nephropathology and large-scale analysis of kidney biopsies.
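The reported per-compartment Dice coefficients can be computed from integer label maps as below; this is a straightforward sketch of the standard metric, with toy random masks standing in for WSI predictions.

```python
import numpy as np

def dice_per_class(pred: np.ndarray, gt: np.ndarray, n_classes: int):
    """Per-compartment Dice on integer label maps, as used to report
    Cortex/Medulla/Capsule/Glomerulus/Artery/Arteriole scores."""
    scores = {}
    for c in range(1, n_classes):          # 0 = background
        p, g = pred == c, gt == c
        denom = p.sum() + g.sum()
        scores[c] = 2.0 * np.logical_and(p, g).sum() / denom if denom else np.nan
    return scores

# toy 6-compartment label maps (labels 1..6), e.g. tiles from a WSI
rng = np.random.default_rng(0)
pred = rng.integers(0, 7, size=(512, 512))
gt = rng.integers(0, 7, size=(512, 512))
print(dice_per_class(pred, gt, n_classes=7))
```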
{"title":"Multistain multicompartment automatic segmentation in renal biopsies with thrombotic microangiopathies and other vasculopathies","authors":"Nicola Altini , Michela Prunella , Surya V. Seshan , Savino Sciascia , Antonella Barreca , Alessandro Del Gobbo , Stefan Porubsky , Hien Van Nguyen , Claudia Delprete , Berardino Prencipe , Deján Dobi , Daan P.C. van Doorn , Sjoerd A.M.E.G. Timmermans , Pieter van Paassen , Vitoantonio Bevilacqua , Jan Ulrich Becker","doi":"10.1016/j.compmedimag.2025.102658","DOIUrl":"10.1016/j.compmedimag.2025.102658","url":null,"abstract":"<div><div>Automatic tissue segmentation is a necessary step for the bulk analysis of whole slide images (WSIs) from paraffin histology sections in kidney biopsies. However, existing models often fail to generalize across the main nephropathological staining methods and to capture the severe morphological distortions in arteries, arterioles, and glomeruli common in thrombotic microangiopathy (TMA) or other vasculopathies. Therefore, we developed an automatic multi-staining segmentation pipeline covering six key compartments: Artery, Arteriole, Glomerulus, Cortex, Medulla, and Capsule/Other. This framework enables downstream tasks such as counting and labeling at instance-, WSI- or biopsy-level. Biopsies (n = 158) from seven centers: Cologne, Turin, Milan, Weill-Cornell, Mainz, Maastricht, Budapest, were classified by expert nephropathologists into TMA (n = 87) or Mimickers (n = 71). Ground truth expert segmentation masks were provided for all compartments, and expert binary TMA classification labels for Glomerulus, Artery, Arteriole. The biopsies were divided into training (n = 79), validation (n = 26), and test (n = 53) subsets. We benchmarked six deep learning models for semantic segmentation (U-Net, FPN, DeepLabV3+, Mask2Former, SegFormer, SegNeXt) and five models for classification (ResNet-34, DenseNet-121, EfficientNet-v2-S, ConvNeXt-Small, Swin-v2-B). We obtained robust segmentation results across all compartments. On the test set, the best models achieved Dice coefficients of 0.903 (Cortex), 0.834 (Medulla), 0.816 (Capsule/Other), 0.922 (Glomerulus), 0.822 (Artery), and 0.553 (Arteriole). The best classification models achieved Accuracy of 0.724 and 0.841 for Glomerulus and Artery plus Arteriole compartments, respectively. Furthermore, we release NePathTK (NephroPathology Toolkit), a powerful open-source end-to-end pipeline integrated with QuPath, enabling accurate segmentation for decision support in nephropathology and large-scale analysis of kidney biopsies.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102658"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145410649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient frequency-decomposed transformer via large vision model guidance for surgical image desmoking
Pub Date: 2025-12-01 | Epub Date: 2025-11-03 | DOI: 10.1016/j.compmedimag.2025.102660
Jiaao Li , Diandian Guo , Youyu Wang , Yanhui Wan , Long Ma , Jialun Pei
Surgical image restoration plays a vital clinical role in improving visual quality during surgery, particularly in minimally invasive procedures where the operating field is frequently obscured by surgical smoke. However, progress on surgical image desmoking remains limited in both algorithm development and customized learning strategies. This work therefore approaches the desmoking task from both theoretical and practical perspectives. First, we analyze the intrinsic characteristics of surgical smoke degradation: (1) spatial localization and dynamics, (2) distinguishable frequency-domain patterns, and (3) the entangled representation of anatomical content and degradative artifacts. These observations motivate an efficient frequency-aware Transformer framework, SmoRestor, which aims to separate and restore true anatomical structures from complex degradations. Specifically, we introduce a high-order Fourier-embedded neighborhood attention transformer that enhances the model’s ability to capture structured degradation patterns across both spatial and frequency domains. We also leverage the semantic priors encoded by large vision models to disambiguate content from degradation through targeted guidance. Moreover, we propose a transfer learning paradigm that injects knowledge from large models into the main network, enabling it to effectively distinguish meaningful content from ambiguous corruption. Experimental results on both public and in-house datasets demonstrate substantial improvements in quantitative performance and visual quality. The source code will be made available.
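The observation that smoke has distinguishable frequency-domain patterns can be illustrated with a simple radial low-/high-pass split in the Fourier domain. This is a sketch of the frequency-decomposition idea only, not SmoRestor's high-order Fourier embedding; the cutoff radius is an arbitrary choice.

```python
import torch

def frequency_split(x: torch.Tensor, radius: float = 0.25):
    """Split a feature map into low/high frequency bands with a radial
    mask in the (shifted) Fourier domain."""
    freq = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))
    h, w = x.shape[-2:]
    yy, xx = torch.meshgrid(torch.linspace(-0.5, 0.5, h),
                            torch.linspace(-0.5, 0.5, w), indexing="ij")
    low_mask = ((xx ** 2 + yy ** 2).sqrt() <= radius).to(x.dtype)
    def back(f):  # inverse transform, keep the real part
        return torch.fft.ifft2(torch.fft.ifftshift(f, dim=(-2, -1)), norm="ortho").real
    low = back(freq * low_mask)          # smoke is mostly smooth and low-frequency
    high = back(freq * (1 - low_mask))   # edges and instruments live here
    return low, high

x = torch.randn(1, 3, 64, 64)            # a smoky surgical frame (toy)
low, high = frequency_split(x)
print(torch.allclose(low + high, x, atol=1e-5))  # bands sum back to the input
```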
{"title":"Efficient frequency-decomposed transformer via large vision model guidance for surgical image desmoking","authors":"Jiaao Li , Diandian Guo , Youyu Wang , Yanhui Wan , Long Ma , Jialun Pei","doi":"10.1016/j.compmedimag.2025.102660","DOIUrl":"10.1016/j.compmedimag.2025.102660","url":null,"abstract":"<div><div>Surgical image restoration plays a vital clinical role in improving visual quality during surgery, particularly in minimally invasive procedures where the operating field is frequently obscured by surgical smoke. However, surgical image desmoking still has limited progress in algorithm development and customized learning strategies. In this regard, this work focuses on the task of desmoking from both theoretical and practical perspectives. First, we analyze the intrinsic characteristics of surgical smoke degradation: (1) spatial localization and dynamics, (2) distinguishable frequency-domain patterns, and (3) the entangled representation of anatomical content and degradative artifacts. These observations motivated us to propose an efficient frequency-aware Transformer framework, namely SmoRestor, which aims to separate and restore true anatomical structures from complex degradations. Specifically, we introduce a high-order Fourier-embedded neighborhood attention transformer that enhances the model’s ability to capture structured degradation patterns across both spatial and frequency domains. Besides, we utilize the semantic priors encoded by large vision models to disambiguate content from degradation through targeted guidance. Moreover, we propose an innovative transfer learning paradigm that injects knowledge from large models to the main network, enabling it to effectively distinguish meaningful content from ambiguous corruption. Experimental results on both public and in-house datasets demonstrate substantial improvements in quantitative performance and visual quality. The source code will be available.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102660"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145466996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Path and bone-contour regularized unpaired MRI-to-CT translation
Pub Date: 2025-12-01 | Epub Date: 2025-10-13 | DOI: 10.1016/j.compmedimag.2025.102656
Teng Zhou , Jax Luo , Yuping Sun , Yiheng Tan , Shun Yao , Nazim Haouchine , Scott Raymond
Accurate MRI-to-CT translation promises the integration of complementary imaging information without the need for additional imaging sessions. Given the practical challenges of acquiring paired MRI and CT scans, robust methods capable of leveraging unpaired datasets are essential for advancing MRI-to-CT translation. Current unpaired methods, which rely predominantly on cycle consistency and contrastive learning frameworks, frequently struggle to translate anatomical features that are highly discernible on CT but less distinguishable on MRI, such as bone structures. This limitation makes these approaches less suitable for radiation therapy, where precise bone representation is essential for accurate treatment planning. To address this challenge, we propose a path- and bone-contour regularized approach for unpaired MRI-to-CT translation. In our method, MRI and CT images are projected into a shared latent space, where the MRI-to-CT mapping is modeled as a continuous flow governed by neural ordinary differential equations. The optimal mapping is obtained by minimizing the transition path length of the flow. To enhance the accuracy of translated bone structures, we introduce a trainable neural network that generates bone contours from MRI and implement mechanisms to directly and indirectly encourage the model to focus on bone contours and their adjacent regions. Evaluations on three datasets demonstrate that our method outperforms existing unpaired MRI-to-CT translation approaches, achieving lower overall error rates. Moreover, in a downstream bone segmentation task, our approach exhibits superior performance in preserving the fidelity of bone structures. Our code is available at: https://github.com/kennysyp/PaBoT.
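A discrete sketch of the path-length-minimizing latent flow: an Euler integration of a learned velocity field, with the accumulated path length used as a regularizer. The network sizes, step count, and loss weighting are assumptions; the paper formulates this with neural ODEs rather than a fixed Euler scheme.

```python
import torch
import torch.nn as nn

class LatentFlow(nn.Module):
    """Euler view of an MRI->CT latent flow: integrate a time-conditioned
    velocity field and accumulate the transition path length."""
    def __init__(self, dim: int, steps: int = 8):
        super().__init__()
        self.vel = nn.Sequential(nn.Linear(dim + 1, 128), nn.Tanh(),
                                 nn.Linear(128, dim))
        self.steps = steps

    def forward(self, z_mri: torch.Tensor):
        z, path_len, dt = z_mri, 0.0, 1.0 / self.steps
        for k in range(self.steps):
            t = torch.full_like(z[:, :1], k * dt)      # broadcast time channel
            v = self.vel(torch.cat([z, t], dim=1))
            z = z + dt * v                             # Euler step along the flow
            path_len = path_len + dt * v.norm(dim=1).mean()
        return z, path_len                             # z approximates the CT latent

flow = LatentFlow(dim=64)
z_ct, plen = flow(torch.randn(4, 64))
# translation loss plus path-length regularization (toy target latent)
loss = nn.functional.mse_loss(z_ct, torch.randn(4, 64)) + 0.1 * plen
loss.backward()
print(z_ct.shape, float(plen))
```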
{"title":"Path and bone-contour regularized unpaired MRI-to-CT translation","authors":"Teng Zhou , Jax Luo , Yuping Sun , Yiheng Tan , Shun Yao , Nazim Haouchine , Scott Raymond","doi":"10.1016/j.compmedimag.2025.102656","DOIUrl":"10.1016/j.compmedimag.2025.102656","url":null,"abstract":"<div><div>Accurate MRI-to-CT translation promises the integration of complementary imaging information without the need for additional imaging sessions. Given the practical challenges associated with acquiring paired MRI and CT scans, the development of robust methods capable of leveraging unpaired datasets is essential for advancing the MRI-to-CT translation. Current unpaired MRI-to-CT translation methods, which predominantly rely on cycle consistency and contrastive learning frameworks, frequently encounter challenges in accurately translating anatomical features that are highly discernible on CT but less distinguishable on MRI, such as bone structures. This limitation renders these approaches less suitable for applications in radiation therapy, where precise bone representation is essential for accurate treatment planning. To address this challenge, we propose a path- and bone-contour regularized approach for unpaired MRI-to-CT translation. In our method, MRI and CT images are projected to a shared latent space, where the MRI-to-CT mapping is modeled as a continuous flow governed by neural ordinary differential equations. The optimal mapping is obtained by minimizing the transition path length of the flow. To enhance the accuracy of translated bone structures, we introduce a trainable neural network to generate bone contours from MRI and implement mechanisms to directly and indirectly encourage the model to focus on bone contours and their adjacent regions. Evaluations conducted on three datasets demonstrate that our method outperforms existing unpaired MRI-to-CT translation approaches, achieving lower overall error rates. Moreover, in a downstream bone segmentation task, our approach exhibits superior performance in preserving the fidelity of bone structures. Our code is available at: <span><span>https://github.com/kennysyp/PaBoT</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102656"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145290008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Colorectal disease diagnosis with deep triple-stream fusion and attention refinement
Pub Date: 2025-12-01 | Epub Date: 2025-11-15 | DOI: 10.1016/j.compmedimag.2025.102669
Abdulfattah Ba Alawi , Abdullah Ammar Karcioglu , Ferhat Bozkurt
Colorectal cancer accounts for a significant proportion of global cancer-related mortality, underscoring the need for robust, early-stage diagnostic methods. In this study, we propose a novel end-to-end deep learning framework that integrates multiple advanced mechanisms to enhance the classification of colorectal disease from histopathologic and endoscopic images. Our model, TripleFusionNet, leverages a triple-stream architecture that combines the strengths of EfficientNetB3, ResNet50, and DenseNet121, enabling the extraction of rich, multi-level feature representations from input images. To strengthen discriminative feature modeling, a Multi-Scale Attention Module concurrently performs spatial and channel-wise recalibration, enabling the network to emphasize diagnostically salient regions. Additionally, a Squeeze-Excite Refinement Block (SERB) selectively enhances informative channel activations while attenuating noise and redundant signals. Feature representations from the individual backbones are adaptively fused through a Progressive Gated Fusion mechanism that dynamically learns context-aware weights for optimal feature integration and redundancy mitigation. We validate our approach on two colorectal benchmarks: CRCCD_V1 (14 classes) and LC25000 (binary). On CRCCD_V1, the best performance is obtained by a conventional classifier trained on our 256-D TripleFusionNet embeddings: an SVM with RBF kernel reaches 96.63% test accuracy with a macro F1 of 96.62%, with the stacking ensemble close behind. Five-fold cross-validation yields comparable out-of-fold means (0.964, with small standard deviations), confirming stability across partitions. End-to-end image-based baselines, including TripleFusionNet, are competitive but slightly surpassed by embedding-based classifiers, highlighting the utility of the learned representation. On LC25000, our method attains 100% accuracy. Beyond accuracy, the approach maintains strong precision, recall, F1, and ROC-AUC, and the fused embeddings transfer effectively to multiple conventional learners (e.g., Random Forest, XGBoost). These results confirm the model’s potential for real-world deployment in computer-aided diagnosis workflows, particularly within resource-constrained clinical settings.
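The winning recipe on CRCCD_V1 (a conventional classifier on top of deep embeddings) is easy to reproduce in outline. Here random vectors stand in for the 256-D TripleFusionNet embeddings, and the SVM hyperparameters are illustrative, not the paper's tuned values.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Stand-in for learned embeddings: one 256-D vector per image,
# with 14 CRCCD_V1-style class labels.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 256))
y = rng.integers(0, 14, size=500)

# SVM with RBF kernel on standardized embeddings, scored by 5-fold CV
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"5-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```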
{"title":"Colorectal disease diagnosis with deep triple-stream fusion and attention refinement","authors":"Abdulfattah Ba Alawi , Abdullah Ammar Karcioglu , Ferhat Bozkurt","doi":"10.1016/j.compmedimag.2025.102669","DOIUrl":"10.1016/j.compmedimag.2025.102669","url":null,"abstract":"<div><div>Colorectal cancer constitutes a significant proportion of global cancer-related mortality, underscoring the imperative for robust and early-stage diagnostic methodologies. In this study, we propose a novel end-to-end deep learning framework that integrates multiple advanced mechanisms to enhance the classification of colorectal disease from histopathologic and endoscopic images. Our model, named <strong>TripleFusionNet</strong>, leverages a unique triple-stream architecture by combining the strengths of EfficientNetB3, ResNet50, and DenseNet121, enabling the extraction of rich, multi-level feature representations from input images. To augment discriminative feature modeling, a <em>Multi-Scale Attention Module</em> is integrated, which concurrently performs spatial and channel-wise recalibration, thereby enabling the network to emphasize diagnostically salient regions. Additionally, we incorporate a <em>Squeeze-Excite Refinement Block (SERB)</em> to selectively enhance informative channel activations while attenuating noise and redundant signals. Feature representations from the individual backbones are adaptively fused through a <em>Progressive Gated Fusion mechanism</em> that dynamically learns context-aware weighting for optimal feature integration and redundancy mitigation. We validate our approach on two colorectal benchmarks: CRCCD_V1 (14 classes) and LC25000 (binary). On CRCCD_V1, the best performance is obtained by a conventional classifier trained on our 256-D <em>TripleFusionNet</em> embeddings—SVM (RBF) reaches <strong>96.63%</strong> test accuracy with macro F1 <strong>96.62%</strong>, with the Stacking Ensemble close behind. With five-fold cross-validation, it yields comparable out-of-fold means (<strong>0.964</strong> with small standard deviations), confirming stability across partitions. End-to-end image-based baselines, including <em>TripleFusionNet</em>, are competitive but are slightly surpassed by embedding-based classifiers, highlighting the utility of the learned representation. On LC25000, our method attains <strong>100%</strong> accuracy. Beyond accuracy, the approach maintains strong precision, recall, F1, and ROC–AUC, and the fused embeddings transfer effectively to multiple conventional learners (e.g., Random Forest, XGBoost). These results confirm the potential of the model for real-world deployment in computer-aided diagnosis workflows, particularly within resource-constrained clinical settings.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102669"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145520750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DuetMatch: Harmonizing semi-supervised brain MRI segmentation via decoupled branch optimization
Pub Date: 2025-12-01 | Epub Date: 2025-11-11 | DOI: 10.1016/j.compmedimag.2025.102666
Thanh-Huy Nguyen , Hoang-Thien Nguyen , Vi Vu , Ba-Thinh Lam , Phat Huynh , Tianyang Wang , Xingjian Li , Ulas Bagci , Min Xu
The limited availability of annotated data in medical imaging makes semi-supervised learning increasingly appealing for its ability to learn from imperfect supervision. Recently, teacher-student frameworks have gained popularity for their training benefits and robust performance. However, jointly optimizing the entire network can hinder convergence and stability, especially in challenging scenarios. To address this for medical image segmentation, we propose DuetMatch, a novel dual-branch semi-supervised framework with asynchronous optimization, where each branch optimizes either the encoder or decoder while keeping the other frozen. To improve consistency under noisy conditions, we introduce Decoupled Dropout Perturbation, enforcing regularization across branches. We also design Pairwise CutMix Cross-Guidance to enhance model diversity by exchanging pseudo-labels through augmented input pairs. To mitigate confirmation bias from noisy pseudo-labels, we propose Consistency Matching, refining labels using stable predictions from frozen teacher models. Extensive experiments on benchmark brain MRI segmentation datasets, including ISLES2022 and BraTS, show that DuetMatch consistently outperforms state-of-the-art methods, demonstrating its effectiveness and robustness across diverse semi-supervised segmentation scenarios.
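A schematic of the asynchronous, decoupled optimization: in each step, one sub-network (encoder or decoder) of each branch is frozen while the other trains, and the branches supervise each other through detached predictions. The architectures, consistency loss, and schedule here are toy assumptions, not the full DuetMatch recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def set_requires_grad(module: nn.Module, flag: bool) -> None:
    for p in module.parameters():
        p.requires_grad_(flag)

def make_branch() -> nn.ModuleDict:
    # toy encoder/decoder pair standing in for a 3D segmentation network
    return nn.ModuleDict({"enc": nn.Conv3d(1, 8, 3, padding=1),
                          "dec": nn.Conv3d(8, 2, 3, padding=1)})

branch_a, branch_b = make_branch(), make_branch()
opt = torch.optim.Adam(list(branch_a.parameters()) + list(branch_b.parameters()),
                       lr=1e-3)
x = torch.randn(2, 1, 8, 32, 32)  # unlabeled 3D MRI patches

for step in range(2):
    # asynchronous optimization: per branch, freeze one sub-network
    set_requires_grad(branch_a["enc"], True)
    set_requires_grad(branch_a["dec"], False)
    set_requires_grad(branch_b["enc"], False)
    set_requires_grad(branch_b["dec"], True)
    logits_a = branch_a["dec"](branch_a["enc"](x))
    logits_b = branch_b["dec"](branch_b["enc"](x))
    # cross-guidance: each branch matches the other's detached prediction
    loss = (F.mse_loss(logits_a, logits_b.detach())
            + F.mse_loss(logits_b, logits_a.detach()))
    opt.zero_grad()
    loss.backward()
    opt.step()
print("unsupervised consistency loss:", float(loss))
```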
{"title":"DuetMatch: Harmonizing semi-supervised brain MRI segmentation via decoupled branch optimization","authors":"Thanh-Huy Nguyen , Hoang-Thien Nguyen , Vi Vu , Ba-Thinh Lam , Phat Huynh , Tianyang Wang , Xingjian Li , Ulas Bagci , Min Xu","doi":"10.1016/j.compmedimag.2025.102666","DOIUrl":"10.1016/j.compmedimag.2025.102666","url":null,"abstract":"<div><div>The limited availability of annotated data in medical imaging makes semi-supervised learning increasingly appealing for its ability to learn from imperfect supervision. Recently, teacher-student frameworks have gained popularity for their training benefits and robust performance. However, jointly optimizing the entire network can hinder convergence and stability, especially in challenging scenarios. To address this for medical image segmentation, we propose <em>DuetMatch</em>, a novel dual-branch semi-supervised framework with asynchronous optimization, where each branch optimizes either the encoder or decoder while keeping the other frozen. To improve consistency under noisy conditions, we introduce <strong>Decoupled Dropout Perturbation</strong>, enforcing regularization across branches. We also design <strong>Pairwise CutMix Cross-Guidance</strong> to enhance model diversity by exchanging pseudo-labels through augmented input pairs. To mitigate confirmation bias from noisy pseudo-labels, we propose <strong>Consistency Matching</strong>, refining labels using stable predictions from frozen teacher models. Extensive experiments on benchmark brain MRI segmentation datasets, including ISLES2022 and BraTS, show that DuetMatch consistently outperforms state-of-the-art methods, demonstrating its effectiveness and robustness across diverse semi-supervised segmentation scenarios.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102666"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145558034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anatomy-informed deep learning and radiomics for neurofibroma segmentation in whole-body MRI
Pub Date: 2025-12-01 | Epub Date: 2025-11-14 | DOI: 10.1016/j.compmedimag.2025.102667
Georgii Kolokolnikov , Marie-Lena Schmalhofer , Lennart Well , Said Farschtschi , Victor-Felix Mautner , Inka Ristow , René Werner
Background and Objectives:
Neurofibromatosis type 1 (NF1) is a genetic disorder characterized by the development of multiple neurofibromas (NFs) throughout the body. Accurate segmentation of these tumors in whole-body magnetic resonance imaging (WB-MRI) is critical for quantifying tumor burden and clinical decision-making. This study aims to develop a pipeline for NF segmentation in fat-suppressed T2-weighted WB-MRI that incorporates anatomical context and radiomics to improve accuracy and specificity.
Methods:
The proposed pipeline consists of three stages: (1) anatomy segmentation using MRSegmentator and refinement with a high-risk NF zone; (2) NF segmentation using an ensemble of 3D anisotropic anatomy-informed U-Nets; and (3) tumor candidate classification using radiomic features to filter false positives. The study used 109 WB-MRI scans from 74 NF1 patients, divided into training and three test sets representing in-domain (3T), domain-shifted (1.5T), and low tumor burden scenarios. Evaluation metrics included per-scan and per-tumor Dice Similarity Coefficient (DSC), Volume Overlap Error (VOE), Absolute Relative Volume Difference (ARVD), and per-scan F1 score. Statistical significance was assessed using Wilcoxon signed-rank tests with Bonferroni correction.
Results:
On the in-domain test set, the proposed ensemble of 3D anisotropic anatomy-informed U-Nets with tumor candidate classification achieved a per-scan DSC of 0.64, outperforming 2D nnU-Net (DSC: 0.52) and 3D full-resolution nnU-Net (DSC: 0.54). Performance was maintained on the domain-shift test set (DSC: 0.51) but declined on low tumor burden cases (DSC: 0.23). Preliminary inter-reader variability analysis showed model-to-expert agreement (DSC: 0.67–0.69) comparable to inter-expert agreement (DSC: 0.69).
Conclusions:
The proposed pipeline achieves the highest performance among established methods for automated NF segmentation in WB-MRI and approaches expert-level consistency. The integration of anatomical context and radiomics enhances robustness. Nonetheless, segmentation performance decreases in low tumor burden scenarios, indicating a key area for future methodological improvements. Additionally, the limited inter-reader agreement observed among experts underscores the inherent complexity and ambiguity of the NF segmentation task.
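For reference, the three per-scan overlap metrics named in the Methods (DSC, VOE, ARVD) can be computed from binary tumor masks as below; the masks here are synthetic stand-ins for WB-MRI segmentations.

```python
import numpy as np

def seg_metrics(pred: np.ndarray, gt: np.ndarray):
    """Dice Similarity Coefficient (DSC), Volume Overlap Error (VOE),
    and Absolute Relative Volume Difference (ARVD) on binary masks."""
    p, g = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    dsc = 2.0 * inter / (p.sum() + g.sum())
    voe = 1.0 - inter / union
    arvd = abs(p.sum() - g.sum()) / g.sum()
    return {"DSC": dsc, "VOE": voe, "ARVD": arvd}

# toy binary masks standing in for whole-body MRI tumor segmentations
rng = np.random.default_rng(1)
gt = rng.random((64, 64, 64)) > 0.9
pred = np.logical_and(gt, rng.random(gt.shape) > 0.3)  # under-segmentation
print({k: round(float(v), 3) for k, v in seg_metrics(pred, gt).items()})
```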
{"title":"Anatomy-informed deep learning and radiomics for neurofibroma segmentation in whole-body MRI","authors":"Georgii Kolokolnikov , Marie-Lena Schmalhofer , Lennart Well , Said Farschtschi , Victor-Felix Mautner , Inka Ristow , René Werner","doi":"10.1016/j.compmedimag.2025.102667","DOIUrl":"10.1016/j.compmedimag.2025.102667","url":null,"abstract":"<div><h3>Background and Objectives:</h3><div>Neurofibromatosis type 1 (NF1) is a genetic disorder characterized by the development of multiple neurofibromas (NFs) throughout the body. Accurate segmentation of these tumors in whole-body magnetic resonance imaging (WB-MRI) is critical for quantifying tumor burden and clinical decision-making. This study aims to develop a pipeline for NF segmentation in fat-suppressed T2-weighted WB-MRI that incorporates anatomical context and radiomics to improve accuracy and specificity.</div></div><div><h3>Methods:</h3><div>The proposed pipeline consists of three stages: (1) anatomy segmentation using MRSegmentator and refinement with a high-risk NF zone; (2) NF segmentation using an ensemble of 3D anisotropic anatomy-informed U-Nets; and (3) tumor candidate classification using radiomic features to filter false positives. The study used 109 WB-MRI scans from 74 NF1 patients, divided into training and three test sets representing in-domain (3T), domain-shifted (1.5T), and low tumor burden scenarios. Evaluation metrics included per-scan and per-tumor Dice Similarity Coefficient (DSC), Volume Overlap Error (VOE), Absolute Relative Volume Difference (ARVD), and per-scan F1 score. Statistical significance was assessed using Wilcoxon signed-rank tests with Bonferroni correction.</div></div><div><h3>Results:</h3><div>On the in-domain test set, the proposed ensemble of 3D anisotropic anatomy-informed U-Nets with tumor candidate classification achieved a per-scan DSC of 0.64, outperforming 2D nnU-Net (DSC: 0.52) and 3D full-resolution nnU-Net (DSC: 0.54). Performance was maintained on the domain-shift test set (DSC: 0.51) but declined on low tumor burden cases (DSC: 0.23). Preliminary inter-reader variability analysis showed model-to-expert agreement (DSC: 0.67–0.69) comparable to inter-expert agreement (DSC: 0.69).</div></div><div><h3>Conclusions:</h3><div>The proposed pipeline achieves the highest performance among established methods for automated NF segmentation in WB-MRI and approaches expert-level consistency. The integration of anatomical context and radiomics enhances robustness. Nonetheless, segmentation performance decreases in low tumor burden scenarios, indicating a key area for future methodological improvements. Additionally, the limited inter-reader agreement observed among experts underscores the inherent complexity and ambiguity of the NF segmentation task.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102667"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145520826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Due to variations in medical image acquisition protocols, segmentation models often exhibit degraded performance when applied to unseen domains. We argue that such degradation primarily stems from overfitting to source domains and insufficient dynamic adaptability to target domains. To address this issue, we propose a hallucinated domain generalization network with domain-aware dynamic representation for medical image segmentation, which introduces a novel "hallucination during training, dynamic representation during testing" scheme to effectively improve generalization. Specifically, we design an uncertainty-aware dynamic hallucination module that achieves adaptive transformation through Bézier curves and estimates potential domain shift by introducing an uncertainty-aware offset variable driven by channel-wise variance, generating diverse synthetic images. This approach breaks the limitations of source domain distributions while preserving original anatomical structures, effectively alleviating the model’s overfitting to the specific styles of source domains. Furthermore, we develop a domain-aware dynamic representation module that treats source domain knowledge as a foundation for understanding unknown domains. Concretely, we obtain unbiased estimates of global style prototypes through domain-wise statistical aggregation and a momentum update strategy. Then, input features are mapped to the unified source domain space through global style prototypes and similarity weights, mitigating performance degradation caused by domain shift during the testing phase. Extensive experiments on four heterogeneously distributed fundus image datasets and six multi-center prostate MRI datasets demonstrate that our approach outperforms state-of-the-art methods.
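The Bézier-curve style transformation at the heart of the hallucination module can be sketched as a smooth intensity remapping. The scalar offset below is a simplified stand-in for the paper's uncertainty-aware, channel-variance-driven offset.

```python
import numpy as np

def bezier_intensity_transform(img: np.ndarray, offset: float) -> np.ndarray:
    """Smooth cubic Bézier remapping of intensities in [0, 1]; shifting the
    inner control points changes image 'style' while preserving anatomy."""
    p0, p3 = 0.0, 1.0
    p1 = float(np.clip(1 / 3 + offset, 0, 1))
    p2 = float(np.clip(2 / 3 - offset, 0, 1))
    t = np.linspace(0, 1, 256)
    # with x-control-points at 0, 1/3, 2/3, 1, the curve's x-coordinate is t
    curve = ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
             + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)
    return np.interp(img, t, curve)      # look up each pixel on the curve

rng = np.random.default_rng(0)
mri = rng.random((128, 128))             # normalized source-domain slice (toy)
hallucinated = bezier_intensity_transform(mri, offset=rng.normal(0, 0.15))
print(hallucinated.min(), hallucinated.max())
```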
{"title":"Hallucinated domain generalization network with domain-aware dynamic representation for medical image segmentation","authors":"Minjun Wang, Houjin Chen, Yanfeng Li, Jia Sun, Luyifu Chen, Peng Liang","doi":"10.1016/j.compmedimag.2025.102670","DOIUrl":"10.1016/j.compmedimag.2025.102670","url":null,"abstract":"<div><div>Due to variations in medical image acquisition protocols, segmentation models often exhibit degraded performance when applied to unseen domains. We argue that such degradation primarily stems from overfitting to source domains and insufficient dynamic adaptability to target domains. To address this issue, we propose a hallucinated domain generalization network with domain-aware dynamic representation for medical image segmentation, which introduces a novel ”hallucination during training, dynamic representation during testing” scheme to effectively improve generalization. Specifically, we design an uncertainty-aware dynamic hallucination module that achieves adaptive transformation through Bézier curves and estimates potential domain shift by introducing the uncertainty-aware offset variable driven by channel-wise variance, generating diverse synthetic images. This approach breaks the limitations of source domain distributions while preserving original anatomical structures, effectively alleviating the model’s overfitting to the specific styles of source domains. Furthermore, we develop a domain-aware dynamic representation module that treats source domain knowledge as a foundation for understanding unknown domains. Concretely, we obtain unbiased estimates of global style prototypes through domain-wise statistical aggregation and the momentum update strategy. Then, input features are mapped to the unified source domain space through global style prototypes and similarity weights, mitigating performance degradation caused by domain shift during the testing phase. Extensive experiments on four heterogeneously distributed fundus image datasets and six multi-center prostate MRI datasets demonstrate that our approach outperforms state-of-the-art methods.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"126 ","pages":"Article 102670"},"PeriodicalIF":4.9,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145566238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}