Medical image analysis: latest literature, page 2

Benefit from public unlabeled data: A Frangi filter-based pretraining network for 3D cerebrovascular segmentation.

IF 10.7 | Zone 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date : 2025-01-17 DOI: 10.1016/j.media.2024.103442

Gen Shi, Hao Lu, Hui Hui, Jie Tian

Precise cerebrovascular segmentation in Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) data is crucial for computer-aided clinical diagnosis. The sparse distribution of cerebrovascular structures within TOF-MRA images often results in high costs for manual data labeling. Leveraging unlabeled TOF-MRA data can significantly enhance model performance. In this study, we have constructed the largest preprocessed unlabeled TOF-MRA dataset to date, comprising 1510 subjects. Additionally, we provide manually annotated segmentation masks for 113 subjects based on existing external image datasets to facilitate evaluation. We propose a simple yet effective pretraining strategy utilizing the Frangi filter, known for its capability to enhance vessel-like structures, to optimize the use of the unlabeled data for 3D cerebrovascular segmentation. This involves a Frangi filter-based preprocessing workflow tailored for large-scale unlabeled datasets and a multi-task pretraining strategy to efficiently utilize the preprocessed data. This approach ensures maximal extraction of useful knowledge from the unlabeled data. The efficacy of the pretrained model is assessed across four cerebrovascular segmentation datasets, where it demonstrates superior performance, improving the clDice metric by approximately 2%-3% compared to the latest semi- and self-supervised methods. Additionally, ablation studies validate the generalizability and effectiveness of our pretraining method across various backbone structures. The code and data have been open-sourced at: https://github.com/shigen-StoneRoot/FFPN.
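The abstract relies on the Frangi filter's ability to enhance vessel-like structures. As a rough illustration of that idea only (not the authors' preprocessing workflow), the sketch below evaluates the classic Frangi vesselness response for one voxel from the three eigenvalues of its Hessian, for bright vessels on a dark background. The function name and the default values of `alpha`, `beta`, and `c` are assumptions chosen for the example.

```python
import math

def frangi_vesselness_3d(eigs, alpha=0.5, beta=0.5, c=1.0):
    """Frangi vesselness of one voxel from its 3 Hessian eigenvalues.

    Assumes bright vessels on a dark background. alpha, beta and c are
    illustrative defaults, not values taken from the paper.
    """
    # Order the eigenvalues so that |l1| <= |l2| <= |l3|.
    l1, l2, l3 = sorted(eigs, key=abs)
    # A bright tube needs two clearly negative eigenvalues across the vessel.
    if l2 >= 0 or l3 >= 0:
        return 0.0
    r_a = abs(l2) / abs(l3)                      # separates plates from lines
    r_b = abs(l1) / math.sqrt(abs(l2 * l3))      # separates blobs from lines
    s = math.sqrt(l1 * l1 + l2 * l2 + l3 * l3)   # overall second-order strength
    return ((1.0 - math.exp(-r_a ** 2 / (2 * alpha ** 2)))
            * math.exp(-r_b ** 2 / (2 * beta ** 2))
            * (1.0 - math.exp(-s ** 2 / (2 * c ** 2))))

tube = frangi_vesselness_3d((0.0, -5.0, -5.0))    # line-like: high response
blob = frangi_vesselness_3d((-5.0, -5.0, -5.0))   # blob-like: damped response
plate = frangi_vesselness_3d((0.0, 0.0, -5.0))    # plate-like: zero response
```

In a full pipeline the eigenvalues would come from Gaussian-smoothed second derivatives at several scales, with the maximum response kept per voxel; that part is omitted here.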


Volume 101, article 103442. Publication type: Journal Article.

Citations: 0

Identifying multilayer network hub by graph representation learning.

IF 10.7 | Zone 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date : 2025-01-16 DOI: 10.1016/j.media.2025.103463

Defu Yang, Minjeong Kim, Yu Zhang, Guorong Wu

Recent advances in neuroimaging technology allow us to understand how the human brain is wired in vivo and how functional activity is synchronized across multiple regions. Growing evidence shows that the complexity of functional connectivity is far beyond the widely used mono-layer network. Indeed, the hierarchical processing of information among distinct brain regions and across multiple channels requires a more advanced multilayer model to understand the synchronization across the brain that underlies functional brain networks. However, a principled approach for characterizing network organization in the context of multilayer topologies is largely unexplored. In this work, we present a novel multi-variate hub identification method that takes both the intra- and inter-layer network topologies into account. Specifically, we put the spotlight on the multilayer graph embeddings that allow us to separate connector hubs (connecting across network modules) from their peripheral nodes. The removal of these hub nodes breaks down the entire multilayer brain network into a set of disconnected communities. We have evaluated our novel multilayer hub identification method in task-based and resting-state functional images. Complementing ongoing findings using mono-layer brain networks, our multilayer network analysis provides a new understanding of brain network topology that links functional connectivities with brain states and disease progression.
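The abstract characterizes connector hubs by the fact that removing them breaks the network into disconnected communities. The sketch below illustrates only that criterion on a toy, single-layer graph: it is a brute-force search, not the graph-embedding method of the paper, and `toy_graph` and both function names are hypothetical.

```python
def count_components(adj, removed=frozenset()):
    """Count connected components of an undirected graph, skipping removed nodes."""
    seen = set(removed)
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return components

def best_connector_hub(adj):
    """Return the node whose removal leaves the most connected components."""
    return max(adj, key=lambda n: count_components(adj, frozenset([n])))

# Two triangular modules {a, b, c} and {d, e, f} bridged only by node h.
toy_graph = {
    "a": ["b", "c", "h"], "b": ["a", "c", "h"], "c": ["a", "b"],
    "d": ["e", "f", "h"], "e": ["d", "f", "h"], "f": ["d", "e"],
    "h": ["a", "b", "d", "e"],
}
hub = best_connector_hub(toy_graph)
```

A multilayer version would have to add inter-layer edges between copies of the same region, which this single-layer toy leaves out.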

Volume 101, article 103463. Publication type: Journal Article.

Citations: 0

UnICLAM: Contrastive representation learning with adversarial masking for unified and interpretable Medical Vision Question Answering.

IF 10.7 | Zone 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date : 2025-01-15 DOI: 10.1016/j.media.2025.103464

Chenlu Zhan, Peng Peng, Hongwei Wang, Gaoang Wang, Yu Lin, Tao Chen, Hongsen Wang

Medical Visual Question Answering aims to assist doctors in decision-making when answering clinical questions regarding radiology images. Nevertheless, current models learn cross-modal representations by placing vision and text encoders in two separate spaces, which inevitably leads to indirect semantic alignment. In this paper, we propose UnICLAM, a Unified and Interpretable Medical-VQA model through Contrastive Representation Learning with Adversarial Masking. To learn an aligned image-text representation, we first establish a unified dual-stream pre-training structure with a gradual soft-parameter sharing strategy for alignment. Specifically, the proposed strategy learns a constraint for the vision and text encoders to be close in the same space, which is gradually loosened as the number of layers increases, so as to narrow the distance between the two different modalities. To grasp the unified semantic cross-modal representation, we extend the adversarial masking data augmentation to the contrastive representation learning of vision and text in a unified manner. While the encoder training minimizes the distance between the original and masked samples, the adversarial masking module keeps adversarial learning to conversely maximize the distance. We also further explore the unified adversarial masking augmentation method, which improves the potential ante-hoc interpretability with remarkable performance and efficiency. Experimental results on VQA-RAD and SLAKE benchmarks demonstrate that UnICLAM outperforms 11 existing state-of-the-art Medical-VQA methods. More importantly, we additionally discuss the performance of UnICLAM in diagnosing heart failure, verifying that UnICLAM exhibits superior few-shot adaptation performance in practical disease diagnosis. The codes and models will be released upon the acceptance of the paper.
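The soft-parameter sharing constraint described above, which keeps the vision and text encoders close and is loosened with depth, can be pictured as a depth-weighted distance penalty. The sketch below is an assumption-laden illustration: the geometric `decay` schedule, the squared Euclidean distance, and the function name are all hypothetical, since the abstract does not specify the form of the constraint.

```python
def soft_sharing_penalty(vision_layers, text_layers, base_weight=1.0, decay=0.5):
    """Depth-weighted squared distance between paired encoder layers.

    Each layer is a flat list of parameters. The weight shrinks geometrically
    with depth, so deeper layers are tied together less strongly (assumed schedule).
    """
    total = 0.0
    for depth, (v, t) in enumerate(zip(vision_layers, text_layers)):
        weight = base_weight * decay ** depth
        squared_distance = sum((a - b) ** 2 for a, b in zip(v, t))
        total += weight * squared_distance
    return total

# Two layers per encoder; both layers differ by the same amount,
# but the deeper one contributes half as much to the penalty.
vision = [[1.0, 0.0], [1.0, 0.0]]
text = [[0.0, 0.0], [0.0, 0.0]]
penalty = soft_sharing_penalty(vision, text)
```

Such a term would be added to the pre-training objective so that the optimizer trades alignment of the two encoders against their modality-specific capacity.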

Volume 101, article 103464. Publication type: Journal Article.

Citations: 0

SIRE: Scale-invariant, rotation-equivariant estimation of artery orientations using graph neural networks.

IF 10.7 | Zone 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date : 2025-01-15 DOI: 10.1016/j.media.2025.103467

Dieuwertje Alblas, Julian Suk, Christoph Brune, Kak Khee Yeung, Jelmer M Wolterink

The orientation of a blood vessel as visualized in 3D medical images is an important descriptor of its geometry that can be used for centerline extraction and subsequent segmentation, labeling, and visualization. Blood vessels appear at multiple scales and levels of tortuosity, and determining the exact orientation of a vessel is a challenging problem. Recent works have used 3D convolutional neural networks (CNNs) for this purpose, but CNNs are sensitive to variations in vessel size and orientation. We present SIRE: a scale-invariant rotation-equivariant estimator for local vessel orientation. SIRE is modular and has strongly generalizing properties due to symmetry preservation. SIRE consists of a gauge equivariant mesh CNN (GEM-CNN) that operates in parallel on multiple nested spherical meshes with different sizes. The features on each mesh are a projection of image intensities within the corresponding sphere. These features are intrinsic to the sphere and, in combination with the gauge equivariant properties of GEM-CNN, lead to SO(3) rotation equivariance. Approximate scale invariance is achieved by weight sharing and use of a symmetric maximum aggregation function to combine predictions at multiple scales. Hence, SIRE can be trained with arbitrarily oriented vessels with varying radii to generalize to vessels with a wide range of calibres and tortuosity. We demonstrate the efficacy of SIRE using three datasets containing vessels of varying scales: the vascular model repository (VMR), the ASOCA coronary artery set, and an in-house set of abdominal aortic aneurysms (AAAs). We embed SIRE in a centerline tracker which accurately tracks large-calibre AAAs, regardless of the data SIRE is trained with. Moreover, a tracker can use SIRE to track small-calibre tortuous coronary arteries, even when trained only with large-calibre, non-tortuous AAAs. Additional experiments are performed to verify the rotation-equivariance and scale-invariance properties of SIRE. In conclusion, by incorporating SO(3) and scale symmetries, SIRE can be used to determine orientations of vessels outside of the training domain, offering a robust and data-efficient solution to geometric analysis of blood vessels in 3D medical images.
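The symmetric maximum aggregation mentioned in the abstract can be illustrated with a vertex-wise maximum over per-scale predictions. This is a minimal sketch under assumptions: each scale is taken to output one score per vertex of a shared spherical mesh, and the function names and toy scores are hypothetical rather than taken from SIRE.

```python
def aggregate_scales(per_scale_scores):
    """Vertex-wise maximum over scales; the result ignores the order of the scales."""
    return [max(scores) for scores in zip(*per_scale_scores)]

def predicted_vertex(per_scale_scores):
    """Index of the mesh vertex with the highest combined score."""
    combined = aggregate_scales(per_scale_scores)
    return max(range(len(combined)), key=combined.__getitem__)

# Three nested spheres (scales), each scoring the same three mesh vertices.
scores = [
    [0.1, 0.2, 0.1],   # smallest sphere
    [0.0, 0.9, 0.3],   # middle sphere responds strongly at vertex 1
    [0.2, 0.1, 0.4],   # largest sphere
]
combined = aggregate_scales(scores)
direction_index = predicted_vertex(scores)
```

Because the maximum is symmetric in its arguments, the combined prediction does not depend on which sphere happened to match the vessel calibre, which is the intuition behind the approximate scale invariance.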

Volume 101, article 103467. Publication type: Journal Article.

Citations: 0

When multiple instance learning meets foundation models: Advancing histological whole slide image analysis.

IF 10.7 | Zone 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date : 2025-01-14 DOI: 10.1016/j.media.2025.103456

Hongming Xu, Mingkang Wang, Duanbo Shi, Huamin Qin, Yunpeng Zhang, Zaiyi Liu, Anant Madabhushi, Peng Gao, Fengyu Cong, Cheng Lu

Deep multiple instance learning (MIL) pipelines are the mainstream weakly supervised learning methodologies for whole slide image (WSI) classification. However, it remains unclear how these widely used approaches compare to each other, given the recent proliferation of foundation models (FMs) for patch-level embedding and the diversity of slide-level aggregations. This paper implemented and systematically compared six FMs and six recent MIL methods by organizing different feature extractions and aggregations across seven clinically relevant end-to-end prediction tasks using WSIs from 4044 patients with four different cancer types. We tested state-of-the-art (SOTA) FMs in computational pathology, including CTransPath, PathoDuet, PLIP, CONCH, and UNI, as patch-level feature extractors. Feature aggregators, such as attention-based pooling, transformers, and dynamic graphs were thoroughly tested. Our experiments on cancer grading, biomarker status prediction, and microsatellite instability (MSI) prediction suggest that (1) FMs like UNI, trained with more diverse histological images, outperform generic models with smaller training datasets in patch embeddings, significantly enhancing downstream MIL classification accuracy and model training convergence speed, (2) instance feature fine-tuning, known as online feature re-embedding, to capture both fine-grained details and spatial interactions can often further improve WSI classification performance, (3) FMs advance MIL models by enabling promising grading classifications, biomarker status, and MSI predictions without requiring pixel- or patch-level annotations. These findings encourage the development of advanced, domain-specific FMs, aimed at more universally applicable diagnostic tasks, aligning with the evolving needs of clinical AI in pathology.
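Attention-based pooling is one of the slide-level aggregators named in the abstract. The sketch below shows the usual form of that pooling, in which each patch embedding receives a softmax weight from a small scoring network and the slide embedding is the weighted sum. The matrix `V`, the vector `w`, and the toy embeddings are hypothetical values, and the code is not taken from the benchmarked implementations.

```python
import math

def attention_pool(instances, V, w):
    """Attention-based MIL pooling.

    instances: list of patch embeddings (lists of floats).
    V: hidden-layer weight matrix (rows are hidden units); w: output weights.
    Returns the slide embedding and the attention weight of each patch.
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Numerically stable softmax over the patch scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(instances[0])
    slide = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return slide, weights

patches = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]   # toy patch embeddings
slide_embedding, attention = attention_pool(patches, V=[[1.0, 0.0]], w=[2.0])
```

In the setting of the abstract, the patch embeddings would come from a frozen foundation model and only the aggregator and classifier would be trained on slide-level labels.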

Volume 101, article 103456. Publication type: Journal Article.

Citations: 0

Dynamic spectrum-driven hierarchical learning network for polyp segmentation.

IF 10.7 | Zone 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date : 2025-01-14 DOI: 10.1016/j.media.2024.103449

Haolin Wang, Kai-Ni Wang, Jie Hua, Yi Tang, Yang Chen, Guang-Quan Zhou, Shuo Li

Accurate automatic polyp segmentation in colonoscopy is crucial for the prompt prevention of colorectal cancer. However, the heterogeneous nature of polyps and differences in lighting and visibility conditions present significant challenges in achieving reliable and consistent segmentation across different cases. Therefore, this study proposes a novel dynamic spectrum-driven hierarchical learning model (DSHNet), the first to specifically leverage image frequency domain information to explore region-level salience differences among and within polyps for precise segmentation. A novel spectral decoupler is proposed to separate low-frequency and high-frequency components, leveraging their distinct characteristics to guide the model in learning valuable frequency features without bias through automatic masking. The low-frequency driven region-level saliency modeling then generates dynamic convolution kernels with individual frequency-aware features, which regulate region-level saliency modeling together with the supervision of the hierarchy of labels, thus enabling adaptation to polyp heterogeneity and illumination variation simultaneously. Meanwhile, the high-frequency attention module is designed to preserve the detailed information at the skip connections, which complements the focus on spatial features at various stages. Experimental results demonstrate that the proposed method outperforms other state-of-the-art polyp segmentation techniques, achieving robust and superior results on five diverse datasets. Code is available at https://github.com/gardnerzhou/DSHNet.
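Separating low-frequency from high-frequency components, as the spectral decoupler does, can be illustrated on a 1D signal with a discrete Fourier transform and a binary mask. This is a simplified sketch under assumptions: DSHNet works on 2D images with automatic masking, whereas the code below uses a fixed, hand-picked `cutoff` and hypothetical function names.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, keeping the real part (valid for conjugate-symmetric spectra)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def split_frequencies(signal, cutoff):
    """Split a signal into low- and high-frequency parts that sum to the input."""
    n = len(signal)
    spectrum = dft(signal)
    # Bins k and n - k carry the same frequency, so they are masked together.
    low_mask = [1.0 if min(k, n - k) <= cutoff else 0.0 for k in range(n)]
    low = idft([m * s for m, s in zip(low_mask, spectrum)])
    high = idft([(1.0 - m) * s for m, s in zip(low_mask, spectrum)])
    return low, high

signal = [1.0, 2.0, 4.0, 2.0, 1.0, 0.0, -1.0, 0.0]
low_part, high_part = split_frequencies(signal, cutoff=1)
```

The low part keeps the smooth, illumination-like trend and the high part keeps edges and fine detail, which mirrors the roles the abstract assigns to the two branches.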

Volume 101, article 103449. Publication type: Journal Article.

Citations: 0

Multi-center brain age prediction via dual-modality fusion convolutional network

IF 10.9 | Zone 1 (Medicine) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date : 2025-01-10 DOI: 10.1016/j.media.2025.103455

Xuebin Chang, Xiaoyan Jia, Simon B. Eickhoff, Debo Dong, Wei Zeng

Accurate prediction of brain age is crucial for identifying deviations between typical individual brain development trajectories and neuropsychiatric disease progression. Although current research has made progress, the effective application of brain age prediction models to multi-center datasets, particularly those with small sample sizes, remains a significant challenge that is yet to be addressed. To this end, we propose a multi-center data correction method, which employs a domain adaptation correction strategy with Wasserstein distance of optimal transport, along with maximum mean discrepancy to improve the generalizability of brain-age prediction models on small-sample datasets. Additionally, most existing neuroimaging-based brain age models treat brain age prediction as either a regression or a classification problem, which may affect the accuracy of the prediction. Therefore, we propose a brain dual-modality fused convolutional neural network model (BrainDCN) for brain age prediction, and optimize this model by introducing a joint loss function of mean absolute error and cross-entropy, which treats the prediction of brain age as both a regression and classification task. Furthermore, to highlight age-related features, we construct weighting matrices and vectors from a single-center training set and apply them to multi-center datasets to weight important features. We validate the BrainDCN model on the CamCAN dataset and achieve the lowest average absolute error compared to state-of-the-art models, demonstrating its superiority. Notably, the joint loss function and weighted features can further improve the prediction accuracy. More importantly, our proposed multi-center correction method is tested on four neuroimaging datasets and achieves the lowest average absolute error compared to widely used correction methods, highlighting the superior performance of the method in cross-center data integration and analysis. Furthermore, the application to multi-center schizophrenia data shows accelerated aging on average compared to normal controls. Thus, this research establishes a pivotal methodological foundation for multi-center brain age prediction studies, exhibiting considerable applicability in clinical contexts, which are predominantly characterized by small-sample datasets.
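The joint loss of mean absolute error and cross-entropy described in the abstract can be sketched for a single subject as the absolute age error plus a cross-entropy over age bins. The sketch makes several assumptions that are not in the abstract: the bin edges, the weighting factor `ce_weight`, the simple weighted sum, and the function name are all hypothetical.

```python
import math

def joint_age_loss(pred_age, bin_logits, true_age, bin_edges, ce_weight=1.0):
    """Absolute error on the age plus cross-entropy on the age bin (one subject).

    bin_edges are the interior edges, so len(bin_logits) == len(bin_edges) + 1.
    """
    absolute_error = abs(pred_age - true_age)
    # The true bin index is the number of edges at or below the true age.
    true_bin = sum(true_age >= edge for edge in bin_edges)
    # Cross-entropy from logits via a numerically stable log-sum-exp.
    top = max(bin_logits)
    log_norm = top + math.log(sum(math.exp(z - top) for z in bin_logits))
    cross_entropy = log_norm - bin_logits[true_bin]
    return absolute_error + ce_weight * cross_entropy

edges = [30.0, 50.0, 70.0]   # four age bins: <30, 30-50, 50-70, >=70
uninformed = joint_age_loss(40.0, [0.0, 0.0, 0.0, 0.0], 42.0, edges)
confident = joint_age_loss(40.0, [0.0, 5.0, 0.0, 0.0], 42.0, edges)
```

Averaging this quantity over a batch would give a training objective in which the regression head and the classification head are supervised together.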


Citations: 0

Measurement of biomechanical properties of transversely isotropic biological tissue using traveling wave expansion

IF 10.9 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date: 2025-01-09 DOI: 10.1016/j.media.2025.103457

Shengyuan Ma, Zhao He, Runke Wang, Aili Zhang, Qingfang Sun, Jun Liu, Fuhua Yan, Michael S. Sacks, Xi-Qiao Feng, Guang-Zhong Yang, Yuan Feng

The anisotropic mechanical properties of fiber-embedded biological tissues are essential for understanding their development, aging, disease progression, and response to therapy. However, accurate and fast assessment of mechanical anisotropy in vivo using elastography remains challenging. To address the dilemma of achieving both accuracy and efficiency in this inverse problem involving complex wave equations, we propose a computational framework that utilizes the traveling wave expansion model. This framework leverages the unique wave characteristics of transversely isotropic material and physically meaningful operator combinations. The analytical solutions for inversion are derived and engineering optimization is made to adapt to actual scenarios. Measurement results using simulations, ex vivo muscle tissue, and in vivo human white matter validate the framework in determining in vivo anisotropic biomechanical properties, highlighting its potential for measurement of a variety of fiber-embedded biological tissues.

Citations: 0

Graph neural networks in histopathology: Emerging trends and future directions.

IF 10.7 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date: 2025-01-07 DOI: 10.1016/j.media.2024.103444

Siemen Brussee, Giorgio Buzzanca, Anne M R Schrader, Jesper Kers

Histopathological analysis of whole slide images (WSIs) has seen a surge in the utilization of deep learning methods, particularly Convolutional Neural Networks (CNNs). However, CNNs often fail to capture the intricate spatial dependencies inherent in WSIs. Graph Neural Networks (GNNs) present a promising alternative, adept at directly modeling pairwise interactions and effectively discerning the topological tissue and cellular structures within WSIs. Recognizing the pressing need for deep learning techniques that harness the topological structure of WSIs, the application of GNNs in histopathology has experienced rapid growth. In this comprehensive review, we survey GNNs in histopathology, discuss their applications, and explore emerging trends that pave the way for future advancements in the field. We begin by elucidating the fundamentals of GNNs and their potential applications in histopathology. Leveraging quantitative literature analysis, we explore four emerging trends: Hierarchical GNNs, Adaptive Graph Structure Learning, Multimodal GNNs, and Higher-order GNNs. Through an in-depth exploration of these trends, we offer insights into the evolving landscape of GNNs in histopathological analysis. Based on our findings, we propose future directions to propel the field forward. Our analysis serves to guide researchers and practitioners towards innovative approaches and methodologies, fostering advancements in histopathological analysis through the lens of graph neural networks.
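As a generic illustration of how a GNN models pairwise interactions between tissue elements, the sketch below builds a k-nearest-neighbour graph over cell centroids and runs one round of mean-aggregation message passing. It is not taken from the review: the helper names `knn_graph` and `mean_aggregate`, the k-NN construction, and the mean aggregator are assumptions chosen for brevity, and a real model would add learned weights and nonlinearities.

```python
import math


def knn_graph(points, k):
    """Connect each node to its k nearest neighbours (by Euclidean distance)."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def mean_aggregate(features, neighbours):
    """One message-passing round: each node's new feature vector is the
    mean of its own features and those of its neighbours."""
    updated = []
    for i, feat in enumerate(features):
        group = [feat] + [features[j] for j in neighbours[i]]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated


# Usage: three cell centroids on a line, one scalar feature per cell.
centroids = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
features = [[1.0], [3.0], [5.0]]
graph = knn_graph(centroids, k=1)
smoothed = mean_aggregate(features, graph)
```

Stacking several such rounds lets information travel beyond immediate neighbours, which is how graph models capture spatial context that a fixed convolutional window may miss.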

Citations: 0

Abductive multi-instance multi-label learning for periodontal disease classification with prior domain knowledge

IF 10.9 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Medical image analysis

Pub Date: 2025-01-07 DOI: 10.1016/j.media.2024.103452

Zi-Yuan Wu, Wei Guo, Wei Zhou, Han-Jia Ye, Yuan Jiang, Houxuan Li, Zhi-Hua Zhou

Machine learning is widely used in dentistry nowadays, offering efficient solutions for diagnosing dental diseases, such as periodontitis and gingivitis. Most existing methods for diagnosing periodontal diseases follow a two-stage process. Initially, they detect and classify potential Regions of Interest (ROIs) and subsequently determine the labels of the whole images. However, unlike the recognition of natural images, the diagnosis of periodontal diseases relies significantly on pinpointing specific affected regions, which requires professional expertise that is not fully captured by existing models. To bridge this gap, we propose a novel ABductive Multi-Instance Multi-Label learning (AB-MIML) approach. In our approach, we treat entire intraoral images as “bags” and local patches as “instances”. By improving current multi-instance multi-label methods, AB-MIML seeks to establish a comprehensive many-to-many relationship to model the intricate correspondence among images, patches, and corresponding labels. Moreover, to harness the power of prior domain knowledge, AB-MIML converts the expertise of doctors and the structural information of images into a knowledge base and performs abductive reasoning to assist the classification and diagnosis process. Experiments unequivocally confirm the superior performance of our proposed method in diagnosing periodontal diseases compared to state-of-the-art approaches across various metrics. Moreover, our method proves invaluable in identifying critical areas correlated with the diagnosis process, aligning closely with determinations made by human doctors.
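The bag/instance framing in this abstract can be sketched as follows. This shows only the generic multi-instance multi-label setup, not AB-MIML itself: the abductive reasoning step and the knowledge base are omitted, and the helper names, the non-overlapping patch grid, the max-pooling rule, and the 0.5 threshold are all assumptions for illustration.

```python
def split_into_patches(image, patch):
    """Split a 2D image (list of rows) into non-overlapping
    patch x patch instances; together they form one bag."""
    rows, cols = len(image), len(image[0])
    return [
        [row[c:c + patch] for row in image[r:r + patch]]
        for r in range(0, rows - patch + 1, patch)
        for c in range(0, cols - patch + 1, patch)
    ]


def bag_scores(instance_scores):
    """Max-pool per-label instance scores into one score per label.
    instance_scores[i][l] is the score of instance i for label l."""
    return [max(col) for col in zip(*instance_scores)]


def predict_labels(instance_scores, labels, threshold=0.5):
    """A bag receives every label whose pooled score reaches the threshold."""
    pooled = bag_scores(instance_scores)
    return [name for name, s in zip(labels, pooled) if s >= threshold]


# Usage: a 4x4 "image" yields four 2x2 instances; the per-instance scores
# below are made-up numbers standing in for a patch classifier's output.
image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
bag = split_into_patches(image, patch=2)
scores = [[0.1, 0.2], [0.7, 0.3], [0.2, 0.1], [0.4, 0.4]]
predicted = predict_labels(scores, ["periodontitis", "gingivitis"])
```

Under max pooling a single strongly scored patch is enough to assign a label to the whole image, and the index of that patch indicates which region drove the decision.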

Citations: 0