
International Journal of Imaging Systems and Technology: Latest Publications

An Explainable AI for Blood Image Classification With Dynamic CNN Model Selection Framework
IF 3 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-04-11 | DOI: 10.1002/ima.70084
Datenji Sherpa, Dibakar Raj Pant

Explainable AI (XAI) frameworks are becoming essential in many areas, including medicine, as they help us understand AI decisions, increasing clinical trust and improving patient care. This research presents a robust and comprehensive Explainable AI framework. To classify images from the BloodMNIST and Raabin-WBC datasets, various pre-trained convolutional neural network (CNN) architectures (VGG, ResNet, DenseNet, EfficientNet, MobileNet variants, SqueezeNet, and Xception) are implemented both individually and in combination with SpinalNet. For parameter analysis, four models (VGG16, VGG19, ResNet50, and ResNet101) were combined with SpinalNet. Notably, these SpinalNet hybrids significantly reduced model parameters while maintaining or even improving accuracy. For example, VGG16 + SpinalNet shows a 40.74% parameter reduction with accuracies of 98.92% (BloodMNIST) and 98.32% (Raabin-WBC). Similarly, combining VGG19, ResNet50, and ResNet101 with SpinalNet reduced weight parameters by 36.36%, 65.33%, and 52.13%, respectively, with improved accuracy on both datasets. These hybrid SpinalNet models are highly efficient and well suited for resource-limited environments. The authors have developed a dynamic model selection framework that selects the best model based on prediction scores, prioritizing lightweight models in cases of ties. This method ensures that the most effective model is used for every input, yielding higher accuracy and better outcomes. Three Explainable AI (XAI) techniques are implemented: Local Interpretable Model-agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), and Gradient-weighted Class Activation Mapping (Grad-CAM). These help identify the key features that influence model predictions.
By combining these XAI methods with dynamic model selection, this research not only achieves excellent accuracy but also provides useful insight into the factors that drive model predictions.
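The selection rule the abstract describes (highest prediction score wins, ties broken in favor of the lightweight model) can be sketched in a few lines. The model names, scores, and parameter counts below are illustrative placeholders, not the paper's data:

```python
# Minimal sketch of dynamic model selection with a lightweight tie-break.
# Candidate entries are (name, prediction_score, parameter_count); the
# values here are invented for illustration only.

def select_model(candidates):
    """Pick the highest-scoring model; break score ties by fewest parameters."""
    return min(candidates, key=lambda m: (-m[1], m[2]))[0]

candidates = [
    ("VGG16+SpinalNet",    0.98, 81_000_000),
    ("ResNet50+SpinalNet", 0.98,  8_900_000),  # tied score, far fewer params
    ("DenseNet121",        0.95,  8_000_000),
]
print(select_model(candidates))  # ResNet50+SpinalNet
```

Sorting by the pair `(-score, params)` encodes both criteria in one key, so the tie-break needs no special-case branch.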

Citations: 0
EDenseNetViT: Leveraging Ensemble Vision Transform Integrated Transfer Learning for Advanced Differentiation and Severity Scoring of Tuberculosis
IF 3 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-04-11 | DOI: 10.1002/ima.70082
Mamta Patankar, Vijayshri Chaurasia, Madhu Shandilya

Lung infections such as tuberculosis (TB), COVID-19, and pneumonia share similar symptoms, making early differentiation with X-ray imaging challenging. This can delay correct treatment and increase disease transmission. The study focuses on extracting hybrid features using multiple techniques to effectively distinguish TB from other lung infections, proposing several methods for early detection and differentiation. To better diagnose TB, the paper presents an ensemble DenseNet with a Vision Transformer (ViT) network (EDenseNetViT). The proposed EDenseNetViT is an ensemble of DenseNet201 and a ViT network that enhances detection of TB in the presence of other lung infections such as pneumonia and COVID-19. Additionally, EDenseNetViT is extended to predict the severity level of TB. This severity score combines weighted low-level and high-level features to grade TB as mild, moderate, severe, or fatal. Evaluation was conducted on chest image datasets, namely the Montgomery Dataset, Shenzhen Dataset, Chest X-ray Dataset, and COVID-19 Radiography Database. All data were merged, and approximately seven thousand images were selected for the experimental design. The study tested seven baseline models for lung infection differentiation. Initially, DenseNet transfer learning models, including DenseNet121, DenseNet169, and DenseNet201, were assessed, with DenseNet201 performing best. Subsequently, DenseNet201 was combined with principal component analysis (PCA) and various classifiers, with the combination of PCA and a random forest classifier proving most effective. However, the EDenseNetViT model surpassed all of them, achieving approximately 99% accuracy in detecting TB and distinguishing it from other lung infections such as pneumonia and COVID-19.
The proposed EDenseNetViT model classified TB, pneumonia, and COVID-19 with average accuracies of 99%, 98%, and 96%, respectively, outperforming the existing models compared.
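The severity-scoring idea (a weighted combination of low- and high-level feature scores mapped to four bands) can be sketched as below. The weights and thresholds are illustrative assumptions, not values reported by the paper:

```python
# Hedged sketch: combine a low-level and a high-level feature score with
# assumed weights, then bin the result into the four severity bands the
# abstract names. All numeric values here are hypothetical.

def severity_level(low_score, high_score, w_low=0.4, w_high=0.6):
    """Map a weighted feature score in [0, 1] to a severity band."""
    s = w_low * low_score + w_high * high_score
    if s < 0.25:
        return "mild"
    elif s < 0.5:
        return "moderate"
    elif s < 0.75:
        return "severe"
    return "fatal"

print(severity_level(0.2, 0.3))  # 0.26 -> moderate
```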

Citations: 0
Enhancing 3D Global and Local Feature Extraction for Pneumonia Multilesion Segmentation
IF 3 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-04-10 | DOI: 10.1002/ima.70083
Huiyao He, Yinwei Zhan, Yulan Yan, Yuefu Zhan

Precise segmentation of pneumonia lesions using deep learning has been a research focus in medical image segmentation. Convolutional neural networks (CNNs) excel at capturing local features through convolutional layers but struggle with global information, while Transformers handle global features and long-range dependencies well but require substantial computational resources and data. Motivated by the recently introduced Mamba architecture, which effectively models long-range dependencies with lower complexity, we develop a novel network architecture that simultaneously enhances the handling of both global and local features. It integrates an enhanced Mamba module, SE3DMamba, to improve the extraction of three-dimensional global features, and a medical variant of deep residual convolution, MDRConv, to enhance the extraction of local features with a self-configuring mechanism. Experiments on two pneumonia CT datasets, the pneumonia multilesion segmentation dataset (PMLSegData) with three lesion types (consolidations, nodules, and cavities) and MosMedData with ground-glass opacifications, demonstrate that our network surpasses state-of-the-art CNN- and Transformer-based segmentation models across all tasks, advancing the clinical feasibility of deep learning for pneumonia multilesion segmentation.
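Mamba builds on state-space models, which process a sequence with a linear recurrence h_t = A·h_{t-1} + B·x_t, y_t = C·h_t, giving long-range context at cost linear in sequence length. The scalar sketch below illustrates only that generic recurrence; it is not the SE3DMamba module, and the coefficients are arbitrary:

```python
# Minimal, non-selective scalar state-space scan: the hidden state h
# carries information from all earlier inputs, decaying geometrically.
# Coefficients A, B, C are illustrative, not learned parameters.

def ssm_scan(x, A=0.9, B=1.0, C=0.5):
    h, ys = 0.0, []
    for xt in x:
        h = A * h + B * xt   # state update mixes past context with new input
        ys.append(C * h)     # readout
    return ys

ys = ssm_scan([1.0, 0.0, 0.0, 0.0])
print([round(y, 4) for y in ys])  # [0.5, 0.45, 0.405, 0.3645]
```

The impulse at t = 0 is still visible in every later output, which is the long-range behavior that selective variants like Mamba make input-dependent.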

Citations: 0
Dual-Resonant RF Coil for Proton and Phosphorus Imaging at 7 Tesla MRI
IF 3 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-04-10 | DOI: 10.1002/ima.70081
Ashraf Abuelhaija, Gameel Saleh, Emad Awada, Sanaa Salama, Samer Issa, Osama Nashwan

Magnetic resonance spectroscopy (MRS) provides a non-invasive method for examining metabolic alterations associated with diseases. While 1H-based MRS is commonly employed, its effectiveness is often limited by signal interference from water, reducing the accuracy of metabolite differentiation. In contrast, X-nuclei MRS leverages the broader chemical shift dispersion of non-hydrogen nuclei to enhance the ability to distinguish between metabolites. This article presents the design and analysis of a dual-resonant meandered coil for 7 Tesla magnetic resonance imaging (MRI) that simultaneously images hydrogen protons (1H) and detects phosphorus (31P) nuclei at 298 MHz and 120.6 MHz, respectively. Both single-channel and four-channel configurations were designed and analyzed. The single-channel coil integrates an LC network for dual resonance, achieving excellent impedance matching (S11 < −10 dB) and a homogeneous magnetic field distribution within the region of interest. A transmission-line-based matching network was implemented to optimize performance at both frequencies. The four-channel coil was simulated using CST Microwave Studio and experimentally validated. Simulations demonstrated impedance matching and minimal mutual coupling of −38 dB at 298 MHz and −24 dB at 120.6 MHz. The measured S-parameters confirmed these results, showing high decoupling and robust performance across all channels. The prototype featured integrated LC networks and optimized meander structures, ensuring efficient power transmission and uniform field distribution. This work highlights the effectiveness of the proposed dual-resonant coil designs for MRS applications, offering promising potential for advanced clinical diagnostics.
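The dual resonance relies on LC resonance, f = 1 / (2π√(LC)). A quick numeric check shows what capacitance an assumed inductance would need at each of the two Larmor frequencies; the 10 nH inductance is an illustrative value, not the coil's actual network:

```python
import math

# Sketch of the LC resonance relation f = 1 / (2*pi*sqrt(L*C)).
# L = 10 nH is an assumed, illustrative inductance.

def resonant_freq(L, C):
    """Resonant frequency of an ideal LC circuit (Hz)."""
    return 1.0 / (2 * math.pi * math.sqrt(L * C))

def cap_for(f, L):
    """Capacitance that resonates inductance L at frequency f."""
    return 1.0 / ((2 * math.pi * f) ** 2 * L)

L = 10e-9  # 10 nH, assumed
for f in (298e6, 120.6e6):  # 1H and 31P frequencies at 7 T
    print(f"{f/1e6:.1f} MHz -> C = {cap_for(f, L)*1e12:.1f} pF")
```

Supporting both frequencies with one structure is what the article's added LC network and matching sections address, since a single fixed L and C pair resonates at only one frequency.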

Citations: 0
Rectal Cancer Segmentation: A Methodical Approach for Generalizable Deep Learning in a Multi-Center Setting
IF 3 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-04-03 | DOI: 10.1002/ima.70076
Jovana Panic, Arianna Defeudis, Lorenzo Vassallo, Stefano Cirillo, Marco Gatti, Roberto Sghedoni, Michele Avanzo, Angelo Vanzulli, Luca Sorrentino, Luca Boldrini, Huong Elena Tran, Giuditta Chiloiro, Giuseppe Roberto D'Agostino, Enrico Menghi, Roberta Fusco, Antonella Petrillo, Vincenza Granata, Martina Mori, Claudio Fiorino, Barbara Alicja Jereczek-Fossa, Marianna Alessandra Gerardi, Serena Dell'Aversana, Antonio Esposito, Daniele Regge, Samanta Rosati, Gabriella Balestra, Valentina Giannini

Noninvasive Artificial Intelligence (AI) techniques have shown great potential in assisting clinicians through the analysis of medical images. However, significant challenges remain in integrating these systems into clinical practice due to the variability of medical data across multi-center databases and the lack of clear implementation guidelines. These issues hinder the ability to achieve robust, reproducible, and statistically significant results. This study thoroughly analyzes several decision-making steps involved in managing a multi-center database and developing AI-based segmentation models, using rectal cancer as a case study. A dataset of 1212 Magnetic Resonance Images (MRIs) from 14 centers was used. The study examined the impact of different image normalization techniques, network hyperparameters, and training set compositions (in terms of size and construction strategies). The findings emphasize the critical role of image normalization in reducing variability and improving performance. Additionally, the study underscores the importance of carefully selecting network structures and loss functions based on the desired outcomes. The potential of clustering approaches to identify representative training subsets, even with limited data sizes, was also evaluated. While no definitive preprocessing pipeline was identified, several networks developed during the study produced promising results on the external validation set. The insights and methodologies presented may help raise awareness and promote more informed decisions when implementing AI systems in medical imaging.
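The abstract stresses the impact of image normalization on multi-center variability without fixing a single method; one scheme commonly compared in such studies is per-volume z-score normalization, sketched here as an assumed example:

```python
import numpy as np

# Hedged sketch: per-volume z-score intensity normalization, one of the
# standard options in multi-center MRI pipelines (not necessarily the
# scheme the study found best).

def zscore_normalize(volume):
    """Return the volume with zero-mean, unit-variance intensities."""
    v = volume.astype(np.float64)
    std = v.std()
    return (v - v.mean()) / (std if std > 0 else 1.0)

# Synthetic volume with center-dependent intensity statistics.
vol = np.random.default_rng(0).normal(200.0, 50.0, size=(8, 64, 64))
norm = zscore_normalize(vol)
print(norm.mean(), norm.std())  # approximately 0.0 and 1.0
```

Normalizing per volume removes scanner-specific intensity offsets and scales before the data from different centers are pooled.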

Citations: 0
Modal Feature Supplementation Enhances Brain Tumor Segmentation
IF 3 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-04-03 | DOI: 10.1002/ima.70079
Kaiyan Zhu, Weiye Cao, Jianhao Xu, Tong Liu, Yue Liu, Weibo Song

For patients with brain tumors, effectively utilizing the complementary information between multimodal medical images is crucial for accurate lesion segmentation, yet exploiting these complementary features across modalities remains challenging. To address this, we propose a modal feature supplement network (MFSNet), which extracts modality features simultaneously using both a main and an auxiliary network. During this process, the auxiliary network supplements the modality features of the main network, enabling accurate brain tumor segmentation. We also design a modal feature enhancement module (MFEM), a cross-layer feature fusion module (CFFM), and an edge feature supplement module (EFSM). MFEM enhances network performance by fusing the modality features from the main and auxiliary networks. CFFM supplements additional contextual information by fusing features from adjacent encoding layers at different scales, which are then passed to the corresponding decoding layers, helping the network preserve more detail during upsampling. EFSM improves performance by using deformable convolution to extract challenging boundary lesion features, which supplement the final output of the decoding layer. We evaluated MFSNet on the BraTS2018 and BraTS2021 datasets. The Dice scores for the whole tumor, tumor core, and enhancing tumor regions were 90.86%, 90.59%, and 84.72% on BraTS2018 and 92.28%, 92.47%, and 86.07% on BraTS2021, respectively.
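The Dice coefficient used to report these results measures overlap between a predicted mask and the ground truth; a minimal implementation on binary arrays:

```python
import numpy as np

# Dice coefficient: 2*|A ∩ B| / (|A| + |B|), the metric behind scores
# such as 90.86% for the whole-tumor region. Toy masks for illustration.

def dice(pred, gt, eps=1e-8):
    """Dice overlap of two binary masks (0 = no overlap, 1 = identical)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1  # 4 foreground voxels
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1  # 6 voxels, 4 overlapping
print(round(dice(a, b), 3))  # 2*4 / (4+6) = 0.8
```

The same formula applies unchanged to 3D volumes, since the sums run over all voxels regardless of array rank.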

This validates the accuracy of MFSNet in brain tumor segmentation, demonstrating its superiority over other networks of similar type.
Citations: 0
Interactive CNN and Transformer-Based Cross-Attention Fusion Network for Medical Image Classification
IF 3 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-04-01 | DOI: 10.1002/ima.70077
Shu Cai, Qiude Zhang, Shanshan Wang, Junjie Hu, Liang Zeng, Kaiyan Li

Medical images typically contain complex structures and abundant detail, exhibiting variations in texture, contrast, and noise across different imaging modalities. Different types of images contain both local and global features with varying expression and importance, making accurate classification highly challenging. Convolutional neural network (CNN)-based approaches are limited by the size of the convolutional kernel, which restricts their ability to capture global contextual information effectively. While transformer-based models can compensate for this limitation by modeling long-range dependencies, they struggle to extract fine-grained local features from images. To address these issues, we propose a novel architecture, the Interactive CNN and Transformer Cross-Attention Fusion Network (IFC-Net). This model leverages the strengths of CNNs for efficient local feature extraction and of transformers for capturing global dependencies, enabling it to preserve both local features and global contextual relationships. Additionally, we introduce a cross-attention fusion module that adaptively adjusts the feature fusion strategy, facilitating efficient integration of local and global features and enabling dynamic information exchange between the CNN and transformer components. Experimental results on four benchmark datasets (ISIC2018, COVID-19, and the linear-array and convex-array liver cirrhosis datasets) demonstrate that the proposed model achieves superior classification performance, outperforming both CNN-only and transformer-only architectures.
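The cross-attention idea behind such a fusion module is that queries from one branch (say, CNN features) attend to keys and values from the other (transformer features). The NumPy sketch below shows only that generic mechanism; token counts, dimensions, and random projections are illustrative, not the IFC-Net module:

```python
import numpy as np

# Generic single-head cross-attention sketch: softmax(Q K^T / sqrt(d)) V,
# with Q from one feature branch and K, V from the other. All shapes and
# weights here are hypothetical.

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feats, kv_feats, d=16, seed=0):
    rng = np.random.default_rng(seed)
    Wq = rng.normal(size=(q_feats.shape[-1], d))
    Wk = rng.normal(size=(kv_feats.shape[-1], d))
    Wv = rng.normal(size=(kv_feats.shape[-1], d))
    Q, K, V = q_feats @ Wq, kv_feats @ Wk, kv_feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d))  # (n_q, n_kv): each query's weights
    return attn @ V                       # one fused row per query token

cnn_tokens = np.random.default_rng(1).normal(size=(8, 32))   # local branch
vit_tokens = np.random.default_rng(2).normal(size=(12, 32))  # global branch
fused = cross_attention(cnn_tokens, vit_tokens)
print(fused.shape)  # (8, 16)
```

Swapping which branch supplies the queries gives the symmetric direction of exchange, which is how bidirectional CNN-transformer interaction is typically built.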

引用次数: 0
Multiscale Three-Dimensional Features and Spatial Feature Evaluation of Human Pulmonary Tuberculosis
IF 3 4区 计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-03-29 DOI: 10.1002/ima.70069
Xiaojiang Zhao, Yun Ding, Bowen Zhang, Huaye Wei, Ting Li, Xin Li

The low detection rate of Mycobacterium tuberculosis in clinical practice leads to a high rate of missed diagnoses for pulmonary tuberculosis (PTB). This study aimed to assess the imaging and pathological characteristics of PTB lesions from multiple dimensions, with a focus on evaluating their three-dimensional (3D) and spatial features. This study employed multiple methods to evaluate the three-dimensional characteristics of PTB. CT was used to visually assess the density and spatial positioning of PTB lesions, and acid-fast staining was used to evaluate the two-dimensional histological features of PTB. Using fMOST technology, a total of 2399 consecutive single-cell resolution images of human PTB tissue were obtained. These images were subsequently reconstructed in 3D to evaluate the pathological characteristics of PTB in three dimensions. The 3D imaging precisely extracted the distribution of different CT values (HU values) and accurately obtained the spatial location information of the lesions, achieving precise localization. Using fMOST technology, we clearly identified the microscopic structures within both normal lung tissue and PTB lesions, revealing the loose structure, continuous alveolar septa, and clearly visible blood vessels of normal lung tissue. In contrast, typical characteristics of PTB lesions included the destruction of normal lung structure, tissue proliferation, necrosis, and inflammatory infiltration, with a significant increase in overall density. 3D observations of the necrotic areas showed high tissue density but low cellular density, primarily composed of necrotic tissue, consistent with the histological characteristics commonly seen in PTB lesions. This enhanced our understanding of the spatial distribution of PTB lesions. The 3D visualization of imaging and pathology enables a more comprehensive identification of the pathological features of PTB lesions. The multiscale model based on the fMOST system provides more detailed structural information and displays the spatial distribution of lesions more accurately. This is particularly beneficial in the evaluation of complex lesions, demonstrating its potential for optimizing diagnostic methods and supporting clinical decision-making.
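The extraction of HU-value distributions mentioned above can be illustrated with a toy example: given a reconstructed CT volume, count the fraction of voxels falling in each HU window. The window names and boundaries here are illustrative assumptions, not the study's calibrated ranges.

```python
import numpy as np

# Hypothetical HU windows; a real study would calibrate these ranges.
HU_WINDOWS = {
    "air": (-1024, -900),
    "normal_lung": (-900, -500),
    "soft_tissue_lesion": (-100, 100),
}

def hu_distribution(volume, windows):
    """Fraction of voxels of a CT volume falling in each HU window."""
    total = volume.size
    return {name: float(((volume >= lo) & (volume < hi)).sum()) / total
            for name, (lo, hi) in windows.items()}

rng = np.random.default_rng(1)
# Toy 3D array standing in for a reconstructed CT volume.
volume = rng.integers(-1024, 200, size=(32, 32, 32))
dist = hu_distribution(volume, HU_WINDOWS)
```

Because the windows are disjoint, the fractions sum to at most 1; voxels outside every window (e.g., bone above +100 HU) are simply uncounted.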

International Journal of Imaging Systems and Technology, 35(3). Open access. Citations: 0
An Explainable Graph Neural Network Approach for Patch Selection Using a New Patch Score Metric in Breast Cancer Detection
IF 3 4区 计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-03-29 DOI: 10.1002/ima.70078
Eranjoli Nalupurakkal Subhija, Vaninirappuputhenpurayil Gopalan Reju

This study aims to develop an algorithm for selecting the most informative and diverse patches from breast histopathology images while excluding irrelevant areas to enhance cancer detection. A key contribution of the method is a new metric, the patch score, that integrates SHAP values with Haralick features, improving both explainability and diagnostic accuracy. The algorithm begins by calculating Haralick features and measuring cosine similarity between patches to construct a graph, which is then used to train a graph neural network (GNN). To assess each patch's contribution to the analysis, we employ a SHAP explainer on the GNN model. The SHAP values and the features from each patch are then used to calculate the patch score, which determines the importance of each patch. Additionally, to incorporate diversity in the selected patches, all patches are clustered based on local binary patterns, and the patch with the highest patch score from each cluster is selected to obtain the final patches for image classification. Features extracted from these patches using a ResNeXt-50 model and fused with 3-norm pooling are used to classify the images as benign or malignant. The proposed framework was evaluated on the BreakHis dataset and demonstrated superior accuracy and precision compared to existing methods. By integrating both explainability and diversity into patch selection, the algorithm delivers a robust, interpretable model, offering dependable diagnostic support for pathologists.
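The graph-construction and scoring steps can be sketched as follows. The cosine-similarity graph over patch feature vectors follows the abstract, but the combination rule inside `patch_scores` is an illustrative stand-in: the paper's exact formula for merging SHAP values with Haralick features is not reproduced here, and the threshold, patch count, and feature width are assumptions.

```python
import numpy as np

def cosine_similarity_graph(features, threshold=0.5):
    """Adjacency matrix linking patches whose feature vectors are
    similar; in the paper this graph is the input to a GNN."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = unit @ unit.T
    adj = (sim >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

def patch_scores(shap_values, haralick_feats):
    """Toy stand-in for the patch score: weight each patch's normalized
    Haralick features by its SHAP attribution and sum."""
    h = haralick_feats / (np.abs(haralick_feats).max(axis=0) + 1e-9)
    return (shap_values[:, None] * h).sum(axis=1)

rng = np.random.default_rng(2)
feats = rng.normal(size=(8, 13))  # 8 patches, 13 Haralick features
shap = rng.normal(size=8)         # one SHAP attribution per patch
adj = cosine_similarity_graph(feats, threshold=0.2)
scores = patch_scores(shap, feats)
best = int(np.argmax(scores))     # highest-scoring patch index
```

In the full method this selection runs per cluster of local-binary-pattern groups, so one top patch is kept from each cluster rather than a single global winner.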

International Journal of Imaging Systems and Technology, 35(3). Citations: 0
CS U-NET: A Medical Image Segmentation Method Integrating Spatial and Contextual Attention Mechanisms Based on U-NET
IF 3 4区 计算机科学 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-03-26 DOI: 10.1002/ima.70072
Zhang Fanyang, Zhang Fan

Medical image segmentation is a crucial process in medical image analysis, with convolutional neural network (CNN)-based methods achieving notable success in recent years. Among these, U-Net has gained widespread use due to its simple yet effective architecture. However, CNNs still struggle to capture global, long-range semantic information. To address this limitation, we present CS U-NET, a novel method built upon Swin-U-Net, which integrates spatial and contextual attention mechanisms. This hybrid approach combines the strengths of both transformers and U-Net architectures to enhance segmentation performance. In this framework, tokenized image patches are processed through a transformer-based U-shaped encoder-decoder, enabling the learning of both local and global semantic features via skip connections. Our method achieves a Dice Similarity Coefficient of 78.64% and a 95% Hausdorff distance of 21.25 on the Synapse multiorgan segmentation dataset, outperforming Trans-U-Net and other state-of-the-art U-Net variants by 4% and 6%, respectively. The experimental results highlight the significant improvements in prediction accuracy and edge detail preservation provided by our approach.
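The Dice Similarity Coefficient reported above measures overlap between a predicted and a ground-truth mask; a minimal NumPy version for binary masks (the smoothing term `eps` and the toy masks are illustrative):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice Similarity Coefficient between two binary masks:
    2*|A ∩ B| / (|A| + |B|), with eps guarding empty masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((4, 4), dtype=int); a[:2] = 1   # mask covering top two rows
b = np.zeros((4, 4), dtype=int); b[1:3] = 1  # mask covering middle rows
score = dice_coefficient(a, b)               # one shared row -> 0.5
```

The 95% Hausdorff distance quoted alongside Dice is a boundary-distance metric (the 95th percentile of surface-to-surface distances) and needs a separate computation, e.g., over mask contours.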

International Journal of Imaging Systems and Technology, 35(2). Citations: 0