LGENet: disentangle anatomy and pathology features for late gadolinium enhancement image segmentation
Pub Date : 2025-02-24 DOI: 10.1007/s11517-025-03326-w
Mingjing Yang, Kangwen Yang, Mengjun Wu, Liqin Huang, Wangbin Ding, Lin Pan, Lei Yin
Myocardial scar segmentation is essential for the clinical diagnosis and prognosis of cardiovascular diseases. Late gadolinium enhancement (LGE) imaging is widely used to visualize left atrial and ventricular scars. However, automatic scar segmentation remains challenging due to the imbalance between scar and background and the variation in scar sizes. To address these challenges, we introduce LGENet, a network that disentangles anatomy and pathology features from LGE images. Because inherent spatial relationships exist between the myocardium and scarring regions, we propose a boundary attention module that conditions scar segmentation on anatomical boundary features, mitigating the imbalance problem. Meanwhile, LGENet predicts scar regions across multiple scales with a multi-depth decision module, addressing the variation in scar sizes. We thoroughly evaluated LGENet on the LAScarQS 2022 and EMIDEC datasets, and the results demonstrate promising performance for cardiac scar segmentation.
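The boundary attention idea, gating pathology features with anatomical boundary features so that scar prediction is conditioned on myocardial geometry, can be illustrated with a small attention gate. A minimal PyTorch sketch, not the authors' implementation; the module structure and channel sizes are assumptions:

```python
import torch
import torch.nn as nn

class BoundaryAttention(nn.Module):
    """Gate scar features with an attention map derived from boundary features.

    Hypothetical sketch: the paper's module likely differs in detail.
    """
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),  # per-pixel attention weights in [0, 1]
        )

    def forward(self, scar_feat, boundary_feat):
        # Concatenate pathology and anatomy features, derive a spatial gate,
        # and reweight the scar features toward the myocardial boundary region.
        attn = self.gate(torch.cat([scar_feat, boundary_feat], dim=1))
        return scar_feat * attn

# Usage: gate 64-channel feature maps from the two decoder branches.
m = BoundaryAttention(64)
out = m(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```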
{"title":"LGENet: disentangle anatomy and pathology features for late gadolinium enhancement image segmentation.","authors":"Mingjing Yang, Kangwen Yang, Mengjun Wu, Liqin Huang, Wangbin Ding, Lin Pan, Lei Yin","doi":"10.1007/s11517-025-03326-w","DOIUrl":"https://doi.org/10.1007/s11517-025-03326-w","url":null,"abstract":"<p><p>Myocardium scar segmentation is essential for clinical diagnosis and prognosis for cardiac vascular diseases. Late gadolinium enhancement (LGE) imaging technology has been widely utilized to visualize left atrial and ventricular scars. However, automatic scar segmentation remains challenging due to the imbalance between scar and background and the variation in scar sizes. To address these challenges, we introduce an innovative network, i.e., LGENet, for scar segmentation. LGENet disentangles anatomy and pathology features from LGE images. Note that inherent spatial relationships exist between the myocardium and scarring regions. We proposed a boundary attention module to allow the scar segmentation conditioned on anatomical boundary features, which could mitigate the imbalance problem. Meanwhile, LGENet can predict scar regions across multiple scales with a multi-depth decision module, addressing the scar size variation issue. In our experiments, we thoroughly evaluated the performance of LGENet using LAScarQS 2022 and EMIDEC datasets. The results demonstrate that LGENet achieved promising performance for cardiac scar segmentation.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143484491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TongueTransUNet: toward effective tongue contour segmentation using well-managed dataset
Pub Date : 2025-02-18 DOI: 10.1007/s11517-024-03278-7
Khalid Al-Hammuri, Fayez Gebali, Awos Kanan
In modern telehealth and healthcare information systems, medical image analysis is essential for understanding the context of images and their complex structure from large, inconsistent-quality, and distributed datasets. Achieving the desired results with deep learning faces several challenges, including data size, labeling, balancing, training, and feature extraction. These challenges make AI models complex and expensive to build and difficult to interpret, turning them into black boxes that can in some cases produce unstable, irrelevant, illegal, or unethical output. In this article, lingual ultrasound is studied to extract the tongue contour in order to understand language behavior and language signature and to utilize it as biofeedback for different applications. The article introduces a design strategy that works effectively with a well-managed, dynamic-size dataset. It includes a hybrid architecture using UNet, a Vision Transformer (ViT), and a contrastive loss in latent space to build a foundation model cumulatively. The process starts by building a reference representation in the embedding space, validated by human experts, against which any new training input is checked. UNet and ViT encoders extract the input feature representations, and the contrastive loss compares each new feature embedding with the reference in the embedding space. A UNet-based decoder reconstructs the image to its original size. Before the final results are released, quality control assesses the segmented contour; if it is rejected, the algorithm requests a human expert to annotate it manually. The results show improved accuracy over traditional techniques, as the model retains only high-quality and relevant features.
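The latent-space check described here, comparing each new embedding against an expert-validated reference, could take the form of a margin-based contrastive term. A hypothetical sketch; the paper's exact loss formulation is not given in the abstract:

```python
import torch
import torch.nn.functional as F

def contrastive_reference_loss(embedding, reference, margin=0.5):
    """Pull candidate embeddings toward the expert-validated reference.

    Hypothetical margin-based formulation of the described latent-space
    comparison; penalizes embeddings whose cosine similarity to the
    reference falls below the margin.
    """
    sim = F.cosine_similarity(embedding, reference, dim=-1)
    return F.relu(margin - sim).mean()

emb = torch.randn(8, 256)                             # batch of candidate embeddings
ref = torch.randn(256).unsqueeze(0).expand(8, -1)     # shared reference representation
loss = contrastive_reference_loss(emb, ref)
```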
{"title":"TongueTransUNet: toward effective tongue contour segmentation using well-managed dataset.","authors":"Khalid Al-Hammuri, Fayez Gebali, Awos Kanan","doi":"10.1007/s11517-024-03278-7","DOIUrl":"https://doi.org/10.1007/s11517-024-03278-7","url":null,"abstract":"<p><p>In modern telehealth and healthcare information systems medical image analysis is essential to understand the context of the images and its complex structure from large, inconsistent-quality, and distributed datasets. Achieving desired results faces a few challenges for deep learning. Examples of these challenges are date size, labeling, balancing, training, and feature extraction. These challenges made the AI model complex and expensive to be built and difficult to understand which made it a black box and produce hysteresis and irrelevant, illegal, and unethical output in some cases. In this article, lingual ultrasound is studied to extract tongue contour to understand language behavior and language signature and utilize it as biofeedback for different applications. This article introduces a design strategy that can work effectively using a well-managed dynamic-size dataset. It includes a hybrid architecture using UNet, Vision Transformer (ViT), and contrastive loss in latent space to build a foundation model cumulatively. The process starts with building a reference representation in the embedding space using human experts to validate any new input for training data. UNet and ViT encoders are used to extract the input feature representations. The contrastive loss was then compared to the new feature embedding with the reference in the embedding space. The UNet-based decoder is used to reconstruct the image to its original size. Before releasing the final results, quality control is used to assess the segmented contour, and if rejected, the algorithm requests an action from a human expert to annotate it manually. The results show an improved accuracy over the traditional techniques as it contains only high quality and relevant features.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning for retinal vessel segmentation: a systematic review of techniques and applications
Pub Date : 2025-02-18 DOI: 10.1007/s11517-025-03324-y
Zhihui Liu, Mohd Shahrizal Sunar, Tian Swee Tan, Wan Hazabbah Wan Hitam
Ophthalmic diseases are a leading cause of vision loss, with retinal damage being irreversible. Retinal blood vessels are vital for diagnosing eye conditions, as even subtle changes in their structure can signal underlying issues. Retinal vessel segmentation is key for early detection and treatment of eye diseases. Traditionally, ophthalmologists manually segmented vessels, a time-consuming process based on clinical and geometric features. However, deep learning advancements have led to automated methods with impressive results. This systematic review, following PRISMA guidelines, examines 79 studies on deep learning-based retinal vessel segmentation published between 2020 and 2024 from four databases: Web of Science, Scopus, IEEE Xplore, and PubMed. The review focuses on datasets, segmentation models, evaluation metrics, and emerging trends. U-Net and Transformer architectures have shown success, with U-Net's encoder-decoder structure preserving details and Transformers capturing global context through self-attention mechanisms. Despite their effectiveness, challenges remain, suggesting future research should explore hybrid models combining U-Net, Transformers, and GANs to improve segmentation accuracy. This review offers a comprehensive look at the current landscape and future directions in retinal vessel segmentation.
{"title":"Deep learning for retinal vessel segmentation: a systematic review of techniques and applications.","authors":"Zhihui Liu, Mohd Shahrizal Sunar, Tian Swee Tan, Wan Hazabbah Wan Hitam","doi":"10.1007/s11517-025-03324-y","DOIUrl":"https://doi.org/10.1007/s11517-025-03324-y","url":null,"abstract":"<p><p>Ophthalmic diseases are a leading cause of vision loss, with retinal damage being irreversible. Retinal blood vessels are vital for diagnosing eye conditions, as even subtle changes in their structure can signal underlying issues. Retinal vessel segmentation is key for early detection and treatment of eye diseases. Traditionally, ophthalmologists manually segmented vessels, a time-consuming process based on clinical and geometric features. However, deep learning advancements have led to automated methods with impressive results. This systematic review, following PRISMA guidelines, examines 79 studies on deep learning-based retinal vessel segmentation published between 2020 and 2024 from four databases: Web of Science, Scopus, IEEE Xplore, and PubMed. The review focuses on datasets, segmentation models, evaluation metrics, and emerging trends. U-Net and Transformer architectures have shown success, with U-Net's encoder-decoder structure preserving details and Transformers capturing global context through self-attention mechanisms. Despite their effectiveness, challenges remain, suggesting future research should explore hybrid models combining U-Net, Transformers, and GANs to improve segmentation accuracy. This review offers a comprehensive look at the current landscape and future directions in retinal vessel segmentation.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
InspirationOnly: synthesizing expiratory CT from inspiratory CT to estimate parametric response map
Pub Date : 2025-02-17 DOI: 10.1007/s11517-025-03322-0
Tiande Zhang, Haowen Pang, Yanan Wu, Jiaxuan Xu, Zhenyu Liang, Shuyue Xia, Chenwang Jin, Rongchang Chen, Shouliang Qi
Chronic obstructive pulmonary disease (COPD) is a highly heterogeneous disease with various phenotypes. Registered inspiratory and expiratory CT images can generate the parametric response map (PRM), which characterizes the phenotypes' spatial distribution and proportions. However, increased radiation dosage, scan time, quality control, and patient cooperation requirements limit the utility of PRM. This study aims to synthesize a PRM using only inspiratory CT scans. First, a CycleGAN with perceptual loss and a multiscale discriminator (MPCycleGAN) is proposed and trained to synthesize registered expiratory CT images from inspiratory images. Next, a strategy named InspirationOnly is introduced, in which synthesized images replace actual expiratory CT images. The image synthesizer outperformed state-of-the-art models, achieving a mean absolute error of 105.66 ± 36.64 HU, a peak signal-to-noise ratio of 21.43 ± 1.87 dB, and a structural similarity of 0.84 ± 0.02. The intraclass correlation coefficients of emphysema, functional small airway disease (fSAD), and normal proportions between InspirationOnly and the ground truth were 0.995, 0.829, and 0.914, respectively. The proposed MPCycleGAN enables the InspirationOnly strategy to yield PRM using only inspiratory CT. The estimated COPD phenotypes are consistent with those from dual-phase CT and correlate with spirometry parameters. This offers a potential tool for characterizing COPD phenotypes, particularly when expiratory CT images are unavailable.
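For context, a PRM assigns each registered voxel a phenotype by thresholding its paired Hounsfield-unit values; the cutoffs commonly used in the PRM literature are -950 HU on inspiration and -856 HU on expiration. A NumPy sketch under those standard thresholds (the paper may use variants):

```python
import numpy as np

def parametric_response_map(insp_hu, exp_hu):
    """Voxel-wise PRM classification from registered inspiratory/expiratory CT.

    Uses the standard literature thresholds (-950 HU inspiratory, -856 HU
    expiratory); labels: 0 = normal, 1 = fSAD, 2 = emphysema.
    """
    prm = np.zeros_like(insp_hu, dtype=np.uint8)   # normal by default
    air_trapping = exp_hu < -856                   # abnormal expiratory attenuation
    prm[air_trapping & (insp_hu >= -950)] = 1      # functional small airway disease
    prm[air_trapping & (insp_hu < -950)] = 2       # emphysema
    return prm

insp = np.random.uniform(-1000, -700, size=(4, 4))
expi = np.random.uniform(-1000, -700, size=(4, 4))
labels = parametric_response_map(insp, expi)
# Phenotype proportions, as compared via intraclass correlation in the paper.
proportions = np.bincount(labels.ravel(), minlength=3) / labels.size
```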
{"title":"InspirationOnly: synthesizing expiratory CT from inspiratory CT to estimate parametric response map.","authors":"Tiande Zhang, Haowen Pang, Yanan Wu, Jiaxuan Xu, Zhenyu Liang, Shuyue Xia, Chenwang Jin, Rongchang Chen, Shouliang Qi","doi":"10.1007/s11517-025-03322-0","DOIUrl":"https://doi.org/10.1007/s11517-025-03322-0","url":null,"abstract":"<p><p>Chronic obstructive pulmonary disease (COPD) is a highly heterogeneous disease with various phenotypes. Registered inspiratory and expiratory CT images can generate the parametric response map (PRM) that characterizes phenotypes' spatial distribution and proportions. However, increased radiation dosage, scan time, quality control, and patient cooperation requirements limit the utility of PRM. This study aims to synthesize a PRM using only inspiratory CT scans. First, a CycleGAN with perceptual loss and a multiscale discriminator (MPCycleGAN) is proposed and trained to synthesize registered expiratory CT images from inspiratory images. Next, a strategy named InspirationOnly is introduced, where synthesized images replace actual expiratory CT images. The image synthesizer outperformed state-of-the-art models, achieving a mean absolute error of 105.66 ± 36.64 HU, a peak signal-to-noise ratio of 21.43 ± 1.87 dB, and a structural similarity of 0.84 ± 0.02. The intraclass correlation coefficients of emphysema, fSAD, and normal proportions between the InspirationOnly and ground truth were 0.995, 0.829, and 0.914, respectively. The proposed MPCycleGAN enables the InspirationOnly strategy to yield PRM using only inspiratory CT. The estimated COPD phenotypes are consistent with those from dual-phase CT and correlated with the spirometry parameters. This offers a potential tool for characterizing phenotypes of COPD, particularly when expiratory CT images are unavailable.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Precise dental caries segmentation in X-rays with an attention and edge dual-decoder network
Pub Date : 2025-02-17 DOI: 10.1007/s11517-025-03318-w
Feng Huang, Jiaxing Yin, Yuxin Ma, Hao Zhang, Shunv Ying
Caries segmentation holds significant clinical importance in medical image analysis, particularly in the early detection and treatment of dental caries. However, existing deep learning segmentation methods often struggle to accurately segment complex caries boundaries. To address this challenge, this paper proposes a novel network, named AEDD-Net, which combines an attention mechanism with a dual-decoder structure to enhance boundary segmentation performance for caries. Unlike traditional methods, AEDD-Net integrates atrous spatial pyramid pooling with cross-coordinate attention mechanisms to effectively fuse global and multi-scale features. Additionally, the network introduces a dedicated boundary generation module that precisely extracts caries boundary information. Moreover, we propose an innovative boundary loss function to further improve the learning of boundary features. Experimental results demonstrate that AEDD-Net significantly outperforms comparison networks in terms of Dice coefficient, Jaccard similarity, precision, and sensitivity, with particularly superior performance in boundary segmentation. This study provides an innovative approach for automated caries segmentation, with promising potential for clinical applications.
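One simple way to realize a boundary loss of this kind is to extract boundary bands from the predicted and ground-truth masks via a morphological gradient and penalize their mismatch. A speculative PyTorch sketch; the paper's actual formulation is likely more elaborate:

```python
import torch
import torch.nn.functional as F

def boundary_loss(pred, target, kernel_size=3):
    """Compare predicted and ground-truth boundaries via a morphological gradient.

    Hypothetical sketch: boundaries are extracted as max_pool(mask) - mask,
    yielding a thin band around each object, then matched with an L1 term.
    """
    def edges(mask):
        pad = kernel_size // 2
        dilated = F.max_pool2d(mask, kernel_size, stride=1, padding=pad)
        return dilated - mask  # thin band around the object boundary

    return F.l1_loss(edges(pred), edges(target))

pred = torch.rand(1, 1, 64, 64)                      # predicted caries probability map
target = (torch.rand(1, 1, 64, 64) > 0.5).float()    # binary ground-truth mask
loss = boundary_loss(pred, target)
```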
{"title":"Precise dental caries segmentation in X-rays with an attention and edge dual-decoder network.","authors":"Feng Huang, Jiaxing Yin, Yuxin Ma, Hao Zhang, Shunv Ying","doi":"10.1007/s11517-025-03318-w","DOIUrl":"https://doi.org/10.1007/s11517-025-03318-w","url":null,"abstract":"<p><p>Caries segmentation holds significant clinical importance in medical image analysis, particularly in the early detection and treatment of dental caries. However, existing deep learning segmentation methods often struggle with accurately segmenting complex caries boundaries. To address this challenge, this paper proposes a novel network, named AEDD-Net, which combines an attention mechanism with a dual-decoder structure to enhance the performance of boundary segmentation for caries. Unlike traditional methods, AEDD-Net integrates atrous spatial pyramid pooling with cross-coordinate attention mechanisms to effectively fuse global and multi-scale features. Additionally, the network introduces a dedicated boundary generation module that precisely extracts caries boundary information. Moreover, we propose an innovative boundary loss function to further improve the learning of boundary features. Experimental results demonstrate that AEDD-Net significantly outperforms other comparison networks in terms of Dice coefficient, Jaccard similarity, precision, and sensitivity, particularly showing superior performance in boundary segmentation. This study provides an innovative approach for automated caries segmentation, with promising potential for clinical applications.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Drug repositioning based on mutual information for the treatment of Alzheimer's disease patients
Pub Date : 2025-02-17 DOI: 10.1007/s11517-025-03325-x
Claudia Cava, Isabella Castiglioni
Computational drug repositioning approaches should be investigated for the identification of new treatments for Alzheimer's patients, as a huge amount of omics data has been produced during pre-clinical and clinical studies. Here, we investigated a gene network in Alzheimer's patients to identify suitable therapeutic targets. We screened the targets of different drugs (34,006 compounds) using data available in the Connectivity Map database. Then, we analyzed transcriptome profiles of Alzheimer's patients to discover a gene-drug network based on mutual information, an index of dependence among genes. Using computational approaches, this study identified a network consisting of 25 genes and compounds and interconnected biological processes. The results also highlight the diagnostic role of the 25 genes, since we obtained good classification performance with a neural network model. We also suggest 12 repurposable drugs (KU-60019, AM-630, CP55940, enflurane, ginkgolide B, linopirdine, apremilast, ibudilast, pentoxifylline, roflumilast, acitretin, and tamibarotene) interacting with 6 genes (ATM, CNR1, GLRB, KCNQ2, PDE4B, and RARA), which we linked to retrograde endocannabinoid signaling, the synaptic vesicle cycle, morphine addiction, and homologous recombination.
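Mutual information between two genes' expression profiles can be estimated by discretizing the continuous values and applying a standard MI estimator. A sketch using scikit-learn; the equal-width binning scheme is an assumption, as the paper's estimator is not specified:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def gene_mutual_information(expr_a, expr_b, bins=10):
    """Estimate mutual information between two genes' expression profiles.

    Continuous expression values are discretized into equal-width bins
    before applying the discrete MI estimator.
    """
    a = np.digitize(expr_a, np.histogram_bin_edges(expr_a, bins=bins))
    b = np.digitize(expr_b, np.histogram_bin_edges(expr_b, bins=bins))
    return mutual_info_score(a, b)

rng = np.random.default_rng(0)
x = rng.normal(size=200)                     # expression of gene 1 across patients
y = x + rng.normal(scale=0.5, size=200)      # correlated expression of gene 2
mi = gene_mutual_information(x, y)           # high MI indicates dependence
```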
{"title":"Drug repositioning based on mutual information for the treatment of Alzheimer's disease patients.","authors":"Claudia Cava, Isabella Castiglioni","doi":"10.1007/s11517-025-03325-x","DOIUrl":"https://doi.org/10.1007/s11517-025-03325-x","url":null,"abstract":"<p><p>Computational drug repositioning approaches should be investigated for the identification of new treatments for Alzheimer's patients as a huge amount of omics data has been produced during pre-clinical and clinical studies. Here, we investigated a gene network in Alzheimer's patients to detect a proper therapeutic target. We screened the targets of different drugs (34,006 compounds) using data available in the Connectivity Map database. Then, we analyzed transcriptome profiles of Alzheimer's patients to discover a network of gene-drugs based on mutual information, representing an index of dependence among genes. This study identified a network consisting of 25 genes and compounds and interconnected biological processes using computational approaches. The results also highlight the diagnostic role of the 25 genes since we obtained good classification performances using a neural network model. We also suggest 12 repurposable drugs (like KU-60019, AM-630, CP55940, enflurane, ginkgolide B, linopirdine, apremilast, ibudilast, pentoxifylline, roflumilast, acitretin, and tamibarotene) interacting with 6 genes (ATM, CNR1, GLRB, KCNQ2, PDE4B, and RARA), that we linked to retrograde endocannabinoid signaling, synaptic vesicle cycle, morphine addiction, and homologous recombination.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rapid wall shear stress prediction for aortic aneurysms using deep learning: a fast alternative to CFD
Pub Date : 2025-02-17 DOI: 10.1007/s11517-025-03311-3
Md Ahasan Atick Faisal, Onur Mutlu, Sakib Mahmud, Anas Tahir, Muhammad E H Chowdhury, Faycal Bensaali, Abdulrahman Alnabti, Mehmet Metin Yavuz, Ayman El-Menyar, Hassan Al-Thani, Huseyin Cagatay Yalcin
Aortic aneurysms pose a significant risk of rupture. Previous research has shown that areas exposed to low wall shear stress (WSS) are more prone to rupture. Therefore, precise WSS determination on the aneurysm is crucial for rupture risk assessment. Computational fluid dynamics (CFD) is a powerful approach for WSS calculation, but it is computationally intensive, hindering time-sensitive clinical decision-making. In this study, we propose a deep learning (DL) surrogate, MultiViewUNet, to rapidly predict time-averaged WSS (TAWSS) distributions on abdominal aortic aneurysms (AAA). Our novel approach employs a domain transformation technique to translate complex aortic geometries into representations compatible with state-of-the-art neural networks. MultiViewUNet was trained on 23 real and 230 synthetic AAA geometries, demonstrating an average normalized mean absolute error (NMAE) of just 0.362% in WSS prediction. This framework has the potential to streamline hemodynamic analysis in AAA and other clinical scenarios where fast and accurate stress quantification is essential.
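For reference, NMAE is usually the mean absolute error scaled by the range of the ground-truth values; the abstract does not state the exact normalization, so the sketch below uses that common convention:

```python
import numpy as np

def nmae(pred, true):
    """Normalized mean absolute error: MAE scaled by the ground-truth range.

    One common convention; the paper's exact normalization is not stated.
    """
    return np.mean(np.abs(pred - true)) / (true.max() - true.min())

wss_true = np.random.rand(5000) * 4.0   # e.g., TAWSS in Pa over surface nodes
wss_pred = wss_true + np.random.normal(scale=0.01, size=5000)
print(f"NMAE: {100 * nmae(wss_pred, wss_true):.3f}%")
```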
{"title":"Rapid wall shear stress prediction for aortic aneurysms using deep learning: a fast alternative to CFD.","authors":"Md Ahasan Atick Faisal, Onur Mutlu, Sakib Mahmud, Anas Tahir, Muhammad E H Chowdhury, Faycal Bensaali, Abdulrahman Alnabti, Mehmet Metin Yavuz, Ayman El-Menyar, Hassan Al-Thani, Huseyin Cagatay Yalcin","doi":"10.1007/s11517-025-03311-3","DOIUrl":"https://doi.org/10.1007/s11517-025-03311-3","url":null,"abstract":"<p><p>Aortic aneurysms pose a significant risk of rupture. Previous research has shown that areas exposed to low wall shear stress (WSS) are more prone to rupture. Therefore, precise WSS determination on the aneurysm is crucial for rupture risk assessment. Computational fluid dynamics (CFD) is a powerful approach for WSS calculations, but they are computationally intensive, hindering time-sensitive clinical decision-making. In this study, we propose a deep learning (DL) surrogate, MultiViewUNet, to rapidly predict time-averaged WSS (TAWSS) distributions on abdominal aortic aneurysms (AAA). Our novel approach employs a domain transformation technique to translate complex aortic geometries into representations compatible with state-of-the-art neural networks. MultiViewUNet was trained on <math><mrow><mn>23</mn></mrow> </math> real and <math><mrow><mn>230</mn></mrow> </math> synthetic AAA geometries, demonstrating an average normalized mean absolute error (NMAE) of just <math><mrow><mn>0.362</mn> <mo>%</mo></mrow> </math> in WSS prediction. This framework has the potential to streamline hemodynamic analysis in AAA and other clinical scenarios where fast and accurate stress quantification is essential.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143442610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Integrating multi-scale information and diverse prompts in large model SAM-Med2D for accurate left ventricular ejection fraction estimation
Pub Date : 2025-02-14 DOI: 10.1007/s11517-025-03310-4
Yagang Wu, Tianli Zhao, Shijun Hu, Qin Wu, Yingxu Chen, Xin Huang, Zhoushun Zheng
Left ventricular ejection fraction (LVEF) is a critical indicator of cardiac function, aiding in the assessment of heart conditions. Accurate segmentation of the left ventricle (LV) is essential for LVEF calculation. However, current methods are often limited by small datasets and exhibit poor generalization. While leveraging large models can address this issue, many fail to capture multi-scale information and place an additional burden on users to generate prompts. To overcome these challenges, we propose LV-SAM, a model based on the large model SAM-Med2D, for accurate LV segmentation. It comprises three key components: an image encoder with a multi-scale adapter (MSAd), a multimodal prompt encoder (MPE), and a multi-scale decoder (MSD). The MSAd extracts multi-scale information at the encoder level and fine-tunes the model, while the MSD employs skip connections to effectively utilize multi-scale information at the decoder level. Additionally, we introduce an automated pipeline for generating self-extracted dense prompts and use a large language model to generate text prompts, reducing the user burden. The MPE processes these prompts, further enhancing model performance. Evaluations on the CAMUS dataset show that LV-SAM outperforms existing state-of-the-art methods in LV segmentation, achieving the lowest MAE of 5.016 in LVEF estimation.
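The downstream LVEF computation itself is standard: LVEF = (EDV - ESV) / EDV × 100. A toy sketch that derives the volumes by voxel counting; echo pipelines such as CAMUS typically use Simpson's biplane rule instead, so this is a simplified illustration:

```python
import numpy as np

def ejection_fraction(ed_mask, es_mask, voxel_volume_ml):
    """LVEF from end-diastolic and end-systolic LV segmentation masks.

    LVEF = (EDV - ESV) / EDV * 100, with volumes obtained as voxel counts
    times the per-voxel volume.
    """
    edv = ed_mask.sum() * voxel_volume_ml
    esv = es_mask.sum() * voxel_volume_ml
    return 100.0 * (edv - esv) / edv

ed = np.ones((64, 64, 32), dtype=bool)                  # toy end-diastolic mask
es = np.zeros_like(ed)
es[16:48, 16:48, 8:24] = True                           # toy end-systolic mask
print(f"LVEF: {ejection_fraction(ed, es, voxel_volume_ml=0.001):.1f}%")
```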
{"title":"Integrating multi-scale information and diverse prompts in large model SAM-Med2D for accurate left ventricular ejection fraction estimation.","authors":"Yagang Wu, Tianli Zhao, Shijun Hu, Qin Wu, Yingxu Chen, Xin Huang, Zhoushun Zheng","doi":"10.1007/s11517-025-03310-4","DOIUrl":"https://doi.org/10.1007/s11517-025-03310-4","url":null,"abstract":"<p><p>Left ventricular ejection fraction (LVEF) is a critical indicator of cardiac function, aiding in the assessment of heart conditions. Accurate segmentation of the left ventricle (LV) is essential for LVEF calculation. However, current methods are often limited by small datasets and exhibit poor generalization. While leveraging large models can address this issue, many fail to capture multi-scale information and introduce additional burdens on users to generate prompts. To overcome these challenges, we propose LV-SAM, a model based on the large model SAM-Med2D, for accurate LV segmentation. It comprises three key components: an image encoder with a multi-scale adapter (MSAd), a multimodal prompt encoder (MPE), and a multi-scale decoder (MSD). The MSAd extracts multi-scale information at the encoder level and fine-tunes the model, while the MSD employs skip connections to effectively utilize multi-scale information at the decoder level. Additionally, we introduce an automated pipeline for generating self-extracted dense prompts and use a large language model to generate text prompts, reducing the user burden. The MPE processes these prompts, further enhancing model performance. Evaluations on the CAMUS dataset show that LV-SAM outperforms existing SOAT methods in LV segmentation, achieving the lowest MAE of 5.016 in LVEF estimation.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143416044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improved segmentation of hepatic vascular networks in ultrasound volumes using 3D U-Net with intensity transformation-based data augmentation
Pub Date : 2025-02-13 DOI: 10.1007/s11517-025-03320-2
Yukino Takahashi, Takaaki Sugino, Shinya Onogi, Yoshikazu Nakajima, Kohji Masuda
Accurate three-dimensional (3D) segmentation of hepatic vascular networks is crucial for supporting ultrasound-mediated theranostics for liver diseases. Despite advancements in deep learning techniques, accurate segmentation remains challenging due to ultrasound image quality issues, including intensity and contrast fluctuations. This study introduces intensity transformation-based data augmentation methods to improve deep convolutional neural network-based segmentation of hepatic vascular networks. We employed a 3D U-Net, which leverages spatial contextual information, as the baseline. To address intensity and contrast fluctuations and improve 3D U-Net performance, we implemented data augmentation using high-contrast intensity transformation with S-shaped tone curves and low-contrast intensity transformation with Gamma and inverse S-shaped tone curves. We conducted validation experiments on 78 ultrasound volumes to evaluate the effect of both geometric and intensity transformation-based data augmentations. We found that high-contrast intensity transformation-based data augmentation decreased segmentation accuracy, while low-contrast intensity transformation-based data augmentation significantly improved Recall and Dice. Additionally, combining geometric and low-contrast intensity transformation-based data augmentations, through an OR operation on their results, further enhanced segmentation accuracy, achieving improvements of 9.7% in Recall and 3.3% in Dice. This study demonstrated the effectiveness of low-contrast intensity transformation-based data augmentation in improving volumetric segmentation of hepatic vascular networks from ultrasound volumes.
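The two families of intensity transformations described, gamma curves and (inverse) S-shaped tone curves, are straightforward to sketch in NumPy. Parameter values here are assumptions, since the paper's exact curves are not given:

```python
import numpy as np

def gamma_transform(img, gamma):
    """Gamma intensity transformation for an image normalized to [0, 1]."""
    return img ** gamma

def s_curve(img, gain=10.0, inverse=False):
    """S-shaped (sigmoid) tone curve; the inverse flattens contrast instead.

    Hypothetical parameterization of the curves described in the paper.
    """
    if inverse:
        # inverse sigmoid pushes values toward mid-gray (lower contrast)
        x = np.clip(img, 1e-6, 1.0 - 1e-6)
        out = 0.5 - np.log(1.0 / x - 1.0) / gain
    else:
        # sigmoid steepens the curve around mid-gray (higher contrast)
        out = 1.0 / (1.0 + np.exp(-gain * (img - 0.5)))
    return np.clip(out, 0.0, 1.0)

img = np.random.rand(128, 128)              # normalized ultrasound slice
low_contrast = s_curve(img, inverse=True)   # flattened, low-contrast variant
brighter = gamma_transform(img, 0.7)        # gamma < 1 brightens the image
```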
{"title":"Improved segmentation of hepatic vascular networks in ultrasound volumes using 3D U-Net with intensity transformation-based data augmentation.","authors":"Yukino Takahashi, Takaaki Sugino, Shinya Onogi, Yoshikazu Nakajima, Kohji Masuda","doi":"10.1007/s11517-025-03320-2","DOIUrl":"https://doi.org/10.1007/s11517-025-03320-2","url":null,"abstract":"<p><p>Accurate three-dimensional (3D) segmentation of hepatic vascular networks is crucial for supporting ultrasound-mediated theranostics for liver diseases. Despite advancements in deep learning techniques, accurate segmentation remains challenging due to ultrasound image quality issues, including intensity and contrast fluctuations. This study introduces intensity transformation-based data augmentation methods to improve deep convolutional neural network-based segmentation of hepatic vascular networks. We employed a 3D U-Net, which leverages spatial contextual information, as the baseline. To address intensity and contrast fluctuations and improve 3D U-Net performance, we implemented data augmentation using high-contrast intensity transformation with S-shaped tone curves and low-contrast intensity transformation with Gamma and inverse S-shaped tone curves. We conducted validation experiments on 78 ultrasound volumes to evaluate the effect of both geometric and intensity transformation-based data augmentations. We found that high-contrast intensity transformation-based data augmentation decreased segmentation accuracy, while low-contrast intensity transformation-based data augmentation significantly improved Recall and Dice. Additionally, combining geometric and low-contrast intensity transformation-based data augmentations, through an OR operation on their results, further enhanced segmentation accuracy, achieving improvements of 9.7% in Recall and 3.3% in Dice. This study demonstrated the effectiveness of low-contrast intensity transformation-based data augmentation in improving volumetric segmentation of hepatic vascular networks from ultrasound volumes.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143411369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unsupervised cross-modality domain adaptation via source-domain labels guided contrastive learning for medical image segmentation
Pub Date : 2025-02-13 DOI: 10.1007/s11517-025-03312-2
Wenshuang Chen, Qi Ye, Lihua Guo, Qi Wu
Unsupervised domain adaptation (UDA) offers a promising approach to enhance discriminant performance on target domains by utilizing domain adaptation techniques. These techniques enable models to leverage knowledge from the source domain to adjust to the feature distribution of the target domain. This paper proposes a unified domain adaptation framework that carries out cross-modality medical image segmentation from two perspectives: image and feature. To achieve image alignment, the loss function of Fourier-based Contrastive Style Augmentation (FCSA) has been fine-tuned to increase the impact of style change and improve system robustness. For feature alignment, a module called Source-domain Labels Guided Contrastive Learning (SLGCL) has been designed to encourage the target domain to align features of different classes with those in the source domain. In addition, a generative adversarial network has been incorporated to ensure consistency in spatial layout and local context in the generated image space. To our knowledge, our method is the first attempt to utilize source-domain class intensity information to guide target-domain class intensity information for feature alignment in an unsupervised domain adaptation setting. Extensive experiments conducted on a public whole heart image segmentation task demonstrate that our proposed method outperforms state-of-the-art UDA methods for medical image segmentation.
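The abstract does not detail FCSA, but Fourier-based style augmentation generally swaps the low-frequency amplitude spectrum (style) between domains while keeping the phase (content). A generic sketch in that spirit, not the paper's method:

```python
import numpy as np

def fourier_style_mix(src, ref, beta=0.05):
    """Swap the low-frequency amplitude of src with ref to change its style.

    Generic Fourier-domain style augmentation (in the spirit of FDA);
    beta sets the half-width of the swapped low-frequency block.
    """
    fft_src = np.fft.fftshift(np.fft.fft2(src))
    fft_ref = np.fft.fftshift(np.fft.fft2(ref))
    amp, pha = np.abs(fft_src), np.angle(fft_src)

    h, w = src.shape
    ch, cw = h // 2, w // 2
    bh, bw = max(1, int(beta * h)), max(1, int(beta * w))
    # Replace the low-frequency amplitude block (style) with the reference's,
    # keeping the source phase (content/layout) intact.
    amp[ch - bh:ch + bh, cw - bw:cw + bw] = \
        np.abs(fft_ref)[ch - bh:ch + bh, cw - bw:cw + bw]

    mixed = amp * np.exp(1j * pha)
    return np.real(np.fft.ifft2(np.fft.ifftshift(mixed)))

src = np.random.rand(256, 256)   # e.g., MR slice (content domain)
ref = np.random.rand(256, 256)   # e.g., CT slice (style domain)
aug = fourier_style_mix(src, ref)
```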
{"title":"Unsupervised cross-modality domain adaptation via source-domain labels guided contrastive learning for medical image segmentation.","authors":"Wenshuang Chen, Qi Ye, Lihua Guo, Qi Wu","doi":"10.1007/s11517-025-03312-2","DOIUrl":"https://doi.org/10.1007/s11517-025-03312-2","url":null,"abstract":"<p><p>Unsupervised domain adaptation (UDA) offers a promising approach to enhance discriminant performance on target domains by utilizing domain adaptation techniques. These techniques enable models to leverage knowledge from the source domain to adjust to the feature distribution in the target domain. This paper proposes a unified domain adaptation framework to carry out cross-modality medical image segmentation from two perspectives: image and feature. To achieve image alignment, the loss function of Fourier-based Contrastive Style Augmentation (FCSA) has been fine-tuned to increase the impact of style change for improving system robustness. For feature alignment, a module called Source-domain Labels Guided Contrastive Learning (SLGCL) has been designed to encourage the target domain to align features of different classes with those in the source domain. In addition, a generative adversarial network has been incorporated to ensure consistency in spatial layout and local context in generated image space. According to our knowledge, our method is the first attempt to utilize source domain class intensity information to guide target domain class intensity information for feature alignment in an unsupervised domain adaptation setting. Extensive experiments conducted on a public whole heart image segmentation task demonstrate that our proposed method outperforms state-of-the-art UDA methods for medical image segmentation.</p>","PeriodicalId":49840,"journal":{"name":"Medical & Biological Engineering & Computing","volume":" ","pages":""},"PeriodicalIF":2.6,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143411299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}