
Latest Publications: International Journal of Imaging Systems and Technology

Multimodal Radiomics and Deep Learning Integration for Bone Health Assessment in Postmenopausal Women via Dental Radiographs: Development of an Interpretable Nomogram
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-28 | DOI: 10.1002/ima.70239
Zhengxia Hu, Xiaodong Wang, Hai Lan

To develop and validate a multimodal machine learning model for opportunistic osteoporosis screening in postmenopausal women using dental periapical radiographs. This retrospective multicenter study analyzed 3885 periapical radiographs paired with DEXA-derived T-scores from postmenopausal women. Clinical, handcrafted radiomic, and deep features were extracted, resulting in a fused feature set. Radiomic features (n = 215) followed Image Biomarker Standardization Initiative (IBSI) guidelines, and deep features (n = 128) were derived from a novel attention-based autoencoder. Feature harmonization used ComBat adjustment; reliability was ensured by intra-class correlation coefficient (ICC) filtering (ICC ≥ 0.80). Dimensionality was reduced via Pearson correlation and LASSO regression. Four classifiers—logistic regression, random forest, multilayer perceptron, and XGBoost—were trained and evaluated across stratified training, internal, and external test sets. A logistic regression model was selected for clinical translation and nomogram development. Decision curve analysis assessed clinical utility. XGBoost achieved the highest classification performance using the fused feature set, with an internal AUC of 94.6% and external AUC of 93.7%. Logistic regression maintained strong performance (external AUC = 91.3%) and facilitated nomogram construction. Deep and radiomic features independently outperformed clinical-only models, confirming their predictive strength. SHAP analysis identified DEXA T-score, age, vitamin D, and selected radiomic/deep features as key contributors. Calibration curves and Hosmer–Lemeshow test (p = 0.492) confirmed model reliability. Decision curve analysis showed meaningful net clinical benefit across decision thresholds. Dental periapical radiographs can be leveraged for accurate, non-invasive osteoporosis screening in postmenopausal women. The proposed model demonstrates high accuracy, generalizability, and interpretability, offering a scalable solution for integration into dental practice.
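The feature-reduction chain described above (Pearson-correlation pruning followed by LASSO) can be sketched with standard scikit-learn tooling. This is a minimal illustration under assumed thresholds; the ComBat harmonization and ICC filtering steps are omitted, and the least-squares LassoCV below stands in for whatever LASSO variant the authors used.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

def prune_correlated(X: pd.DataFrame, threshold: float = 0.90) -> pd.DataFrame:
    """Drop one feature from every pair whose |Pearson r| exceeds the threshold."""
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return X.drop(columns=drop)

def lasso_select(X: pd.DataFrame, y: np.ndarray) -> list:
    """Keep features with a non-zero coefficient at the CV-chosen alpha."""
    Xs = StandardScaler().fit_transform(X)
    lasso = LassoCV(cv=5, random_state=0).fit(Xs, y)
    return X.columns[lasso.coef_ != 0].tolist()

# Synthetic stand-in for the fused set (215 radiomic + 128 deep = 343 features).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(400, 343)),
                 columns=[f"feat_{i}" for i in range(343)])
y = (X["feat_0"] + 0.5 * X["feat_1"] + rng.normal(size=400) > 0).astype(float).to_numpy()
selected = lasso_select(prune_correlated(X), y)
```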

Citations: 0
Feature Reconstruction-Guided Multi-Scale Attention Network for Non-Significant Lung Nodule Detection
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-28 | DOI: 10.1002/ima.70235
Huiqing Xu, Wei Li, Junfang Tu, Lvchen Cao

Lung cancer remains the leading cause of cancer-related incidence and mortality worldwide. Early detection of lung nodules is crucial for significantly reducing the risk of lung cancer. However, due to the high similarity in CT image features between lung nodules and surrounding normal tissues, nodules are often missed or misidentified during the detection process. Moreover, the diverse types and morphologies of nodules further complicate the development of a unified detection approach. To address these challenges, this study proposes a novel Feature Reconstruction-guided Multi-Scale Attention Network (FRMANet). Specifically, a Refined Feature Reconstruction Module is designed to effectively suppress redundant information while preserving essential feature representations of nodules, ensuring high sensitivity and enhanced representation capability for nodule regions during feature extraction. Additionally, a Multi-scale Feature Enhancement Attention mechanism is introduced, which utilizes an attention-based fusion strategy across multiple scales to fully capture discriminative features of nodules with varying sizes and shapes. Experimental results on the LUNA16 dataset demonstrate that the proposed FRMANet achieves superior detection performance, with a mAP of 0.894 and an F1 score of 0.923, outperforming existing state-of-the-art methods.
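A minimal PyTorch sketch of attention-weighted multi-scale fusion in the spirit of the Multi-scale Feature Enhancement Attention described above. The gating design, channel width, and number of scales are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleAttentionFusion(nn.Module):
    def __init__(self, channels: int, num_scales: int = 3):
        super().__init__()
        # One 1x1 conv per scale produces a single-channel attention logit map.
        self.gates = nn.ModuleList(
            nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_scales)
        )

    def forward(self, feats):
        # Resize every scale to the finest resolution before fusing.
        target = feats[0].shape[-2:]
        resized = [F.interpolate(f, size=target, mode="bilinear",
                                 align_corners=False) for f in feats]
        logits = torch.stack([g(f) for g, f in zip(self.gates, resized)], dim=0)
        weights = torch.softmax(logits, dim=0)  # normalize across scales
        return sum(w * f for w, f in zip(weights, resized))

# Usage: three feature maps at strides 1x, 2x, 4x with 64 channels each.
fusion = MultiScaleAttentionFusion(channels=64)
feats = [torch.randn(2, 64, s, s) for s in (64, 32, 16)]
fused = fusion(feats)  # -> (2, 64, 64, 64)
```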

Citations: 0
Radiomic Feature-Based Prediction of Primary Cancer Origins in Brain Metastases Using Machine Learning
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-28 | DOI: 10.1002/ima.70234
Dilek Betül Sarıdede, Sevim Cengiz

Identifying the primary tumor origin is a critical factor in determining treatment strategies for brain metastases, which remain a major challenge in clinical practice. Traditional diagnostic methods rely on invasive procedures, which may be limited by sampling errors. In this study, a dataset of 200 patients with brain metastases originating from six different cancer types (breast, gastrointestinal, small cell lung, melanoma, non-small cell lung, and renal cell carcinoma) was included. Radiomic features were extracted from different magnetic resonance images (MRI) and selected using the Kruskal–Wallis test, correlation analysis, and ElasticNet regression. Machine learning models, including support vector machine, logistic regression, and random forest, were trained and evaluated using cross-validation and unseen test sets to predict the primary origins of metastatic brain tumors. Our results demonstrate that radiomic features can significantly enhance classification accuracy, with AUC values reaching 0.98 in distinguishing between specific cancer types. Additionally, survival analysis revealed significant differences in survival probabilities across primary tumor types. This study utilizes a larger, single-center cohort and a standardized MRI protocol, applying rigorous feature selection and multiple machine learning classifiers to enhance the robustness and clinical relevance of radiomic predictions. Our findings support the potential of radiomics as a non-invasive tool for metastatic tumor prediction and prognostic assessment, paving the way for improved personalized treatment strategies. Radiomic features extracted from MRI images can significantly enhance the prediction of the main origin of the metastatic tumor types in the brain, thereby informing treatment decisions and prognostic assessments.
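The Kruskal–Wallis screening and ElasticNet selection steps might look as follows with scipy and scikit-learn; the significance threshold, the one-vs-rest multinomial setup, and the synthetic data are assumptions for illustration.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

def kruskal_screen(X: np.ndarray, y: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Return indices of features that differ across classes (p < alpha)."""
    keep = []
    for j in range(X.shape[1]):
        groups = [X[y == c, j] for c in np.unique(y)]
        _, p = kruskal(*groups)
        if p < alpha:
            keep.append(j)
    return np.asarray(keep)

def elasticnet_select(X, y, l1_ratio=0.5, C=1.0):
    """Keep features with a non-zero coefficient in any one-vs-rest class."""
    clf = LogisticRegression(penalty="elasticnet", solver="saga",
                             l1_ratio=l1_ratio, C=C, max_iter=5000)
    clf.fit(StandardScaler().fit_transform(X), y)
    return np.where(np.any(clf.coef_ != 0, axis=0))[0]

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 100))    # 200 patients, 100 radiomic features
y = rng.integers(0, 6, size=200)   # six primary-tumor classes
idx = elasticnet_select(X[:, kruskal_screen(X, y)], y)
```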

Citations: 0
ViTCXRResNet: Harnessing Explainable Artificial Intelligence in Medical Imaging—Chest X-Ray-Based Patients Demographic Prediction
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-27 | DOI: 10.1002/ima.70233
Sugirdha Ranganathan, Kirubhasini Srinivasan, Sriramakrishnan Pathmanaban, Kalaiselvi Thiruvenkadam

Patient demographic prediction involves estimating age, gender, ethnicity, and other personal characteristics from X-rays. This can support personalized medicine and improve healthcare outcomes. It can assist automated diagnosis for diseases with age- and gender-specific prevalence, and it can help forensic science identify individuals when demographic information is missing. Deep learning can verify the gender and age of self-reported individuals from chest X-rays (CXRs). In this work, we deploy an artificial intelligence (AI)-enabled model that addresses two tasks: gender classification and age prediction from CXRs. For gender classification, the model, called ViTCXRResNet, combines ResNet-50 (a CNN) with a Vision Transformer (ViT) to leverage both local feature extraction and global contextual understanding. The model was trained and validated on an Amazon Web Services (SPR) dataset of 10,702 images with an 80–20 split and evaluated with standard classification metrics. For age prediction, features extracted from ResNet-50 were reduced in dimensionality via principal component analysis (PCA), and a fully connected feedforward neural network was trained on the reduced feature set. On the SPR dataset, the model achieves 93.46% accuracy for gender classification and an R² score of 0.86 for age prediction. For visual interpretation, explainable AI (Gradient-weighted Class Activation Mapping) was used to visualize which parts of the image the model prioritizes when classifying gender. The proposed model yields high accuracy in gender detection and strong accuracy in age prediction, competitive with existing methods. Further, the model's demographic-prediction stability was demonstrated on two demographically distinct datasets: the Japanese Society of Radiological Technology (JSRT) and Montgomery (USA) datasets.
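The age-prediction branch (ResNet-50 features → PCA → feedforward regressor) can be sketched as below; the 2048-dimensional embeddings and 128 PCA components are plausible assumptions rather than figures from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 2048))   # stand-in ResNet-50 embeddings
ages = rng.uniform(20, 90, size=1000)      # stand-in patient ages

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=128),                 # dimensionality reduction
    MLPRegressor(hidden_layer_sizes=(256, 64), max_iter=500, random_state=0),
)
model.fit(features, ages)
pred = model.predict(features[:5])         # predicted ages for five scans
```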

Citations: 0
DenseNet201SA++: Enhanced Melanoma Recognition in Dermoscopy Images via Soft Attention Guided Feature Learning
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-24 | DOI: 10.1002/ima.70236
Shuangshuang Hu, Xiaomei Xu

As the first line of defense in the human immune system, the skin is highly susceptible to environmental toxins. Melanoma, the most lethal type of skin cancer, is characterized by high mortality and a strong tendency for metastasis. It can sometimes originate from pre-existing nevi, particularly dysplastic nevi. Early identification is crucial for improving patient survival rates. However, traditional skin lesion detection faces challenges due to image quality limitations, dataset imperfections, and the complexity of lesion features. This study proposes the DenseNet201SA++ model, which uses image augmentation techniques and the soft attention mechanism to optimize dermoscopy image quality and automatically capture critical features. Experiments on the HAM10000 dataset with 10,015 dermoscopic images, focusing on binary classification (melanoma vs. nevus), show that the DenseNet201SA++ model achieves significant performance gains, with improvements in precision, recall, F1-score, and accuracy of at least 7.2%, 14.7%, 12.7%, and 14.7% compared to baseline networks. The proposed soft attention-guided feature fusion in DenseNet201SA++ addresses feature redundancy in traditional attention mechanisms, achieving superior performance in distinguishing melanoma (Mel) from nevus (Nv), while the DenseNet201 backbone shows distinct advantages. Ablation studies confirm the significant role of data augmentation. The integrated DenseNet201SA++ model achieves robust results with precision, recall, F1-score, and accuracy all reaching 0.983, complemented by an AUC of 0.993. These metrics demonstrate the model's exceptional balance between discriminative power and generalization capability, validating the effectiveness of our proposed architecture.
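One plausible form of a soft-attention block of the kind appended to a DenseNet201 backbone is sketched below: a convolution produces several attention maps that are softmax-normalized over space and used to reweight the features through a learnable residual. The number of maps and the residual scaling are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SoftAttention(nn.Module):
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.heads = heads
        self.attn_conv = nn.Conv2d(channels, heads, kernel_size=3, padding=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        logits = self.attn_conv(x).view(b, self.heads, -1)
        maps = torch.softmax(logits, dim=-1).view(b, self.heads, h, w)
        attended = maps.mean(dim=1, keepdim=True) * x  # average the heads
        return x + self.gamma * attended               # residual soft attention

x = torch.randn(2, 1920, 7, 7)   # e.g., DenseNet201 final feature map
out = SoftAttention(1920)(x)     # same shape, attention-reweighted
```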

Citations: 0
Machine Learning Framework for Classification of COVID-19 Variants Using K-mer Based DNA Sequencing
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-22 | DOI: 10.1002/ima.70231
Sunil Kumar, Sanjay Raju, Biswajit Bhowmik

Accurate classification of viral DNA sequences is essential for tracking mutations, understanding viral evolution, and enabling timely public health responses. Traditional alignment-based methods are often computationally intensive and less effective for highly mutating viruses. This article presents a machine learning framework for classifying DNA sequences of COVID-19 variants using K-mer-based tokenization and vectorization techniques inspired by Natural Language Processing (NLP). DNA sequences corresponding to Alpha, Beta, Gamma, and Omicron variants are obtained from the Global Initiative on Sharing All Influenza Data (GISAID) database and encoded into feature vectors. Multiple classifiers, including Extra Trees, Random Forest, Support Vector Classifier (SVC), Decision Tree, Logistic Regression, Naive Bayes, K-Nearest Neighbor (KNN), Ridge Classifier, Stochastic Gradient Descent (SGD), and XGBoost, are evaluated based on accuracy, precision, recall, and F1-score. The Extra Trees model achieved the highest accuracy of 93.10% ± 0.42, followed by Random Forest with 92.60% ± 0.38, both demonstrating robust and balanced performance. Statistical significance tests confirmed the robustness of the results. The results validate the effectiveness of K-mer-based encoding combined with traditional machine learning models in classifying COVID-19 variants, offering a scalable and efficient solution for genomic surveillance.
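The K-mer tokenization and vectorization step maps each genome onto a bag of overlapping substrings, after which any tabular classifier applies. A minimal sketch with k = 6 (an assumed value) and toy sequences:

```python
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import CountVectorizer

def kmerize(seq: str, k: int = 6) -> str:
    """Turn a DNA string into a space-separated 'sentence' of k-mers."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

seqs = ["ATGCGTACGTTAGC", "ATGCGTACGTTAGG", "TTGCAATCGGATCC", "TTGCAATCGGATCA"]
labels = ["Alpha", "Alpha", "Omicron", "Omicron"]  # stand-in variant labels

vec = CountVectorizer()                     # one column per distinct k-mer
X = vec.fit_transform(kmerize(s) for s in seqs)
clf = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, labels)
print(clf.predict(vec.transform([kmerize("ATGCGTACGTTAGA")])))
```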

Citations: 0
M3IF-(SWT-TVC): Multi-Modal Medical Image Fusion via Weighted Energy, Contrast in the SWT Domain, and Total Variation Minimization With Chambolle's Algorithm
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-18 | DOI: 10.1002/ima.70222
Prabhishek Singh, Manoj Diwakar

Multi-modal medical image fusion (M3IF) combines the essential information from different medical imaging modalities (computed tomography [CT], magnetic resonance imaging [MRI], positron emission tomography [PET], and single photon emission computed tomography [SPECT]) into a single informative image, supporting enhanced patient diagnosis and precise treatment planning. This paper proposes a hybrid M3IF method in which input medical images are decomposed by the stationary wavelet transform (SWT) into low-frequency components (LFCs) and high-frequency components (HFCs). The LFCs and HFCs are fused using energy- and contrast-based metrics, and the fused image is then reconstructed with the inverse SWT (ISWT). Total variation minimization (TVM) using Chambolle's algorithm is applied as a post-refinement step to reduce noise while preserving fine details. The proposed methodology is termed M3IF-(SWT-TVC), where TVC denotes TVM with Chambolle's algorithm. The TVM refinement is iterative, with the fusion outcomes of M3IF-(SWT-TVC) assessed over a predefined 100 iterations. TVM and SWT are blended to balance smoothness and structural detail. The final fusion results of M3IF-(SWT-TVC) are evaluated against several prominent non-traditional methods; based on both visual quality and quantitative metric analysis, M3IF-(SWT-TVC) outperforms all methods used for comparison.
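Under the stated pipeline, a compact sketch with pywt and scikit-image follows; the max-energy and max-absolute-value rules below are simplified stand-ins for the paper's weighted energy and contrast metrics, and the TV weight is an assumed value.

```python
import numpy as np
import pywt
from skimage.restoration import denoise_tv_chambolle

def fuse_swt_tvc(img_a, img_b, level=1, wavelet="db2"):
    ca = pywt.swt2(img_a, wavelet, level=level)
    cb = pywt.swt2(img_b, wavelet, level=level)
    fused = []
    for (la, ha), (lb, hb) in zip(ca, cb):
        # LFCs: keep the coefficient with higher local energy.
        low = np.where(la**2 >= lb**2, la, lb)
        # HFCs: keep the coefficient with larger magnitude (contrast proxy).
        high = tuple(np.where(np.abs(da) >= np.abs(db), da, db)
                     for da, db in zip(ha, hb))
        fused.append((low, high))
    rec = pywt.iswt2(fused, wavelet)
    # Post-refinement: total variation minimization, Chambolle's algorithm.
    return denoise_tv_chambolle(rec, weight=0.05)

rng = np.random.default_rng(0)
ct, mri = rng.random((128, 128)), rng.random((128, 128))
fused = fuse_swt_tvc(ct, mri)
```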

Citations: 0
Optimizing Skin Cancer Classification With ResNet-18: A Scalable Approach With 3D Total Body Photography (3D-TBP)
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-16 | DOI: 10.1002/ima.70224
Javed Rashid, Turke Althobaiti, Alina Shabbir, Muhammad Shoaib Saleem, Muhammad Faheem

Skin cancer, particularly melanoma, remains a major public health challenge because of its rising incidence and mortality rates. Traditional methods of diagnosis, like dermoscopy and biopsies, are invasive, time-consuming, and highly dependent on clinical experience. Furthermore, previous research has predominantly focused on 2D dermoscopic images, which do not capture the volumetric information required for proper evaluation of the lesion. This work introduces a new deep learning architecture based on the ResNet-18 model, augmented by transfer learning, for binary classification of malignant and benign skin lesions. The model is trained on the ISIC 2024 3D Total Body Photography dataset and uses pre-trained ImageNet weights to enable effective feature extraction. To counter the dataset's natural class imbalance and minimize overfitting, the model uses data augmentation and oversampling methods. The proposed model achieves a classification accuracy of 99.82%, surpassing many 2D-based alternatives. The utilization of 3D-TBP offers a strong diagnostic benefit by allowing volumetric lesion analysis, retaining spatial and depth features usually lost in conventional 2D images. The findings validate the clinical feasibility of the method, presenting a scalable, noninvasive, and highly accurate approach to early detection and diagnosis of melanoma using 3D skin imaging.
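The transfer-learning setup (ImageNet-pretrained ResNet-18 with a re-sized final layer for binary classification) can be sketched in torchvision as below; the optimizer and learning rate are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # benign vs. malignant

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a dummy batch of 224x224 RGB crops.
images = torch.randn(8, 3, 224, 224)
targets = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()
```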

Citations: 0
Automated Lumbar Disc Intensity Classification From MRI Scans Using Region-Based CNNs and Transformer Models
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-16 | DOI: 10.1002/ima.70229
Hasan Ulutas, Mustafa Fatih Erkoc, Erdal Ozbay, Muhammet Emin Sahin, Mucella Ozbay Karakus, Esra Yuce

This study explores the effectiveness of deep learning methodologies in the detection and classification of lumbar disc intensity using MRI scans. Initially, region-based deep learning frameworks, including Faster R-CNN and Mask R-CNN with different backbones such as ResNet50 and ResNet101, are evaluated. Results demonstrated that backbone selection significantly impacts model performance, with Mask R-CNN combined with ResNet101 achieving a remarkable mAP@0.5 (AP50) of 99.83%. In addition to object detection models, Transformer-based classification architectures, including MaxViT, Vision Transformer (ViT), a Hybrid CNN-ViT model, and Fine-Tuned Enhanced Pyramid Network (FT-EPN), are implemented. Among these, the Hybrid model achieved the highest classification accuracy (83.1%), while MaxViT yielded the highest precision (0.804). Comparative analyses highlighted that while Mask R-CNN models excelled in segmentation and detection tasks, Transformer-based models provided effective solutions for direct severity classification of lumbar discs. These findings emphasize the critical role of both backbone architecture and model type in optimizing diagnostic performance. The study demonstrates the potential of integrating region-based and Transformer-based models in advancing automated lumbar spine assessment, paving the way for more accurate and reliable medical diagnostic systems.
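Fine-tuning torchvision's Mask R-CNN for a two-class (background + disc) setup follows the standard head-replacement recipe sketched below. Note that torchvision ships a ResNet-50-FPN variant out of the box, so it stands in here for the paper's ResNet101 backbone; the class count is an illustrative assumption.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 2  # background + lumbar disc
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box head with one sized for our classes.
in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, num_classes)

# Replace the mask head likewise (256 hidden channels, per the default recipe).
in_feat_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feat_mask, 256, num_classes)
```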

Citations: 0
A Novel Network With Spectrum Transformer and Triplet Attention for CT Image Segmentation
IF 2.5 | Zone 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-16 | DOI: 10.1002/ima.70221
Ju Zhang, Jiahao Yu, Changgang Ying, Yun Cheng, Fanghong Wang

Deep learning-based methods have made great progress in CT image segmentation in recent years. However, the lack of unified large-scale datasets, unbalanced categories in segmented images, blurred boundaries between infected and healthy regions, and lesions of varying size and shape mean that existing methods still struggle to further improve segmentation accuracy in medical applications. In this work, a novel network with a spectrum Transformer and triplet attention for CT image segmentation is proposed. The spectrum Transformer module and the triplet attention module (TAM) are combined in parallel via the parallel hybrid fusion module (PHFM), which extracts global and local contextual information from sequence features and cross-dimensional features. A spectrum Transformer block (STB) is proposed, which uses the fast Fourier transform (FFT) to learn the weight of each frequency component in the spectral space. Extensive comparison experiments on the COVID-19-Seg dataset and Mosmeddata show that the proposed network is more accurate than most existing methods for CT image segmentation. On the two datasets, the proposed model improves DSC by 1.29% and 2.28% and mIoU by 1.15% and 1.35%, respectively, while SEN increases by 1.45% and 1.48%. PRE also achieves the best results, showing a significant advantage in accurately segmenting medical images, and SPE is likewise strong on both datasets. Ablation studies show that the proposed STB and TAM modules improve segmentation performance significantly.
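A PyTorch sketch of an FFT-based spectrum block in the spirit of the STB: features are transformed with a 2-D real FFT, each frequency component is scaled by a learnable complex weight, and an inverse FFT returns to the spatial domain (a GFNet-style global filter). The shapes and initialization are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SpectrumBlock(nn.Module):
    def __init__(self, channels: int, h: int, w: int):
        super().__init__()
        # Learnable complex weight per channel and rFFT frequency bin.
        self.weight = nn.Parameter(torch.randn(channels, h, w // 2 + 1, 2) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        freq = torch.fft.rfft2(x, norm="ortho")           # (B, C, H, W//2+1)
        freq = freq * torch.view_as_complex(self.weight)  # per-frequency scaling
        return torch.fft.irfft2(freq, s=x.shape[-2:], norm="ortho")

x = torch.randn(2, 64, 32, 32)
out = SpectrumBlock(64, 32, 32)(x)  # same shape as the input
```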

Citations: 0