
Computer Methods and Programs in Biomedicine: Latest Articles

Classification of α-thalassemia data using machine learning models
IF 4.9; Zone 2 (Medicine); Q1, Computer Science, Interdisciplinary Applications; Pub Date: 2025-01-06; DOI: 10.1016/j.cmpb.2024.108581
Frederik Christensen, Deniz Kenan Kılıç, Izabela Ewa Nielsen, Tarec Christoffer El-Galaly, Andreas Glenthøj, Jens Helby, Henrik Frederiksen, Sören Möller, Alexander Djupnes Fuglkjær

Background:

Around 7% of the global population has congenital hemoglobin disorders, with over 300,000 new cases of α-thalassemia annually. Diagnosis is costly and inaccurate in low-income regions, often relying on complete blood count (CBC) tests. This study employs machine learning (ML) to classify α-thalassemia traits based on gender and CBC, exploring the effects of grouping silent- and non-carriers.

Methods:

The dataset includes 288 individuals with suspected α-thalassemia from Sri Lanka. It was classified using eleven discriminant formulae and nine ML models. Outliers were removed using Mahalanobis distance, and resampling was conducted with the synthetic minority oversampling technique (SMOTE) and SMOTE-nominal continuous (NC). The Mann–Whitney U test handled feature extraction and class grouping. ML performance was evaluated with eight criteria.
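
The resampling step can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the authors' pipeline; the function names, the value of `k`, and the two-feature rows are hypothetical, and only continuous features are handled (SMOTE-NC additionally deals with nominal features such as sex).

```python
import random


def smote_sample(minority, k, rng):
    """Return one synthetic point interpolated toward a near minority neighbour."""
    base = rng.choice(minority)
    # Rank the other minority samples by squared Euclidean distance to the base
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment base -> neighbour
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


def oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples (continuous features only)."""
    rng = random.Random(seed)
    return [smote_sample(minority, k, rng) for _ in range(n_new)]


# Hypothetical minority-class rows: [feature_1, feature_2] (made-up values)
minority_rows = [[70.0, 5.5], [72.0, 5.8], [68.0, 6.0]]
synthetic_rows = oversample(minority_rows, n_new=4)
```

Because each synthetic row is a convex combination of two real rows, it always stays inside the range spanned by the observed minority samples.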

Results:

The Ehsani formula achieved an area under the receiver operating characteristic curve (ROC-AUC) of 0.66 by grouping silent- and non-carriers. The convolutional neural network (CNN) without feature extraction demonstrated better performance, with an accuracy of 0.85, sensitivity of 0.8, specificity of 0.86, and ROC-AUC of 0.95/0.93 (micro/macro). Performance was maintained even without preprocessing.

Conclusion:

ML models outperformed classical discriminant formulae in classifying α-thalassemia using sex and CBC features. A larger dataset could enhance ML model generalization and the impact of feature extraction. Grouping silent- and non-carriers improved ML results, especially with resampling. The silent carriers were not separable from non-carriers regarding the available features.
Citations: 0
Symmetric deformable registration of multimodal brain magnetic resonance images via appearance residuals
IF 4.9; Zone 2 (Medicine); Q1, Computer Science, Interdisciplinary Applications; Pub Date: 2025-01-06; DOI: 10.1016/j.cmpb.2024.108578
Yunzhi Huang, Luyi Han, Haoran Dou, Sahar Ahmad, Pew-Thian Yap

Background and Objective:

Deformable registration of multimodal brain magnetic resonance images presents significant challenges, primarily due to substantial structural variations between subjects and pronounced differences in appearance across imaging modalities.

Methods:

Here, we propose to symmetrically register images from two modalities based on appearance residuals from one modality to another. Computed with simple subtraction between modalities, the appearance residuals enhance structural details and form a common representation for simplifying multimodal deformable registration. The proposed framework consists of three serially connected modules: (i) an appearance residual module, which learns intensity residual maps between modalities with a cycle-consistent loss; (ii) a deformable registration module, which predicts deformations across subjects based on appearance residuals; and (iii) a deblurring module, which enhances the warped images to match the sharpness of the original images.

Results:

The proposed method, evaluated on two public datasets (HCP and LEMON), achieves the highest registration accuracy with topology preservation when compared with state-of-the-art methods.

Conclusions:

Our residual space-guided registration framework, combined with GAN-based image enhancement, provides an effective solution to the challenges of multimodal deformable registration. By mitigating intensity distribution discrepancies and improving image quality, this approach improves registration accuracy and strengthens its potential for clinical application.
Citations: 0
DIFLF: A domain-invariant features learning framework for single-source domain generalization in mammogram classification
IF 4.9; Zone 2 (Medicine); Q1, Computer Science, Interdisciplinary Applications; Pub Date: 2025-01-06; DOI: 10.1016/j.cmpb.2025.108592
Wanfang Xie, Zhenyu Liu, Litao Zhao, Meiyun Wang, Jie Tian, Jiangang Liu

Background and Objective

Single-source domain generalization (SSDG) aims to generalize a deep learning (DL) model trained on one source dataset to multiple unseen datasets. This is important for the clinical applications of DL-based models to breast cancer screening, wherein a DL-based model is commonly developed in an institute and then tested in other institutes. One challenge of SSDG is to alleviate the domain shifts using only one domain dataset.

Methods

The present study proposes a domain-invariant features learning framework (DIFLF) for single-source domain generalization. Specifically, DIFLF comprises a style-augmentation module (SAM) and a content-style disentanglement module (CSDM). SAM includes two different color jitter transforms, which transform each mammogram in the source domain into two synthesized mammograms with new styles. This greatly increases the feature diversity of the source domain and reduces overfitting of the trained model. CSDM includes three feature disentanglement units, which extract domain-invariant content (DIC) features by disentangling them from domain-specific style (DSS) features, reducing the influence of domain shifts that result from different feature distributions. Our code is available for open access on GitHub (https://github.com/85675/DIFLF).
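
To make the style-augmentation idea concrete, here is a minimal sketch of producing two differently styled views of one grayscale image with random brightness and contrast jitter. It is an illustration only, not the DIFLF code from the repository above; the function names, jitter ranges, and the tiny 2x2 image are hypothetical.

```python
import random


def jitter(image, rng, brightness=0.2, contrast=0.2):
    """Apply random brightness and contrast scaling to pixels in [0, 1]."""
    b = 1.0 + rng.uniform(-brightness, brightness)  # brightness factor
    c = 1.0 + rng.uniform(-contrast, contrast)      # contrast factor
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    # Contrast stretches pixels around the mean; brightness rescales the result
    return [[min(1.0, max(0.0, b * (mean + c * (p - mean)))) for p in row]
            for row in image]


def two_views(image, seed=0):
    """Return two independently jittered versions of the same image."""
    rng = random.Random(seed)
    return jitter(image, rng), jitter(image, rng)


# Hypothetical 2x2 grayscale patch with intensities in [0, 1]
patch = [[0.1, 0.5], [0.9, 0.3]]
view_a, view_b = two_views(patch)
```

Both factors are positive, so the ordering of pixel intensities (the content) is preserved while the overall appearance (the style) changes between views.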

Results

DIFLF was trained on a private dataset (PRI1) and tested first on another private dataset (PRI2), whose feature distribution is similar to that of PRI1, and then on two public datasets (INbreast and MIAS), whose feature distributions differ greatly from that of PRI1. The experimental results show that DIFLF achieves excellent performance in classifying mammograms in the unseen target datasets PRI2, INbreast, and MIAS. The accuracy and AUC of DIFLF are 0.917 and 0.928 on PRI2, 0.882 and 0.893 on INbreast, and 0.767 and 0.710 on MIAS, respectively.

Conclusions

DIFLF can alleviate the influence of domain shifts using only one source dataset. Moreover, DIFLF achieves excellent mammogram classification performance even on unseen datasets whose feature distributions differ greatly from that of the training dataset.
Citations: 0
An emotionally intelligent haptic system – An efficient solution for anxiety detection and mitigation
IF 4.9; Zone 2 (Medicine); Q1, Computer Science, Interdisciplinary Applications; Pub Date: 2025-01-06; DOI: 10.1016/j.cmpb.2025.108590
Swapneel Mishra, Saumya Seth, Shrishti Jain, Vasudev Pant, Jolly Parikh, Nupur Chugh, Yugnanda Puri

Background

Anxiety is a psycho-physiological condition associated with an individual's mental state. Persistent long-term anxiety can lead to anxiety disorder, which is the underlying cause of many mental health problems. It is therefore critical to identify anxiety precisely using automated, effective, and user-bias-free methods.

Objective

The objective of this study is to develop an innovative, emotionally intelligent haptic system for anxiety detection, which can be used to track and manage people's anxiety.

Method

The suggested approach incorporates a haptic feedback mechanism that is based on EEG data and is analysed by machine learning algorithms. This allows users to effectively control their emotional well-being by receiving timely feedback and assessments of their anxiety levels. First, the authors use publicly accessible data to present an experimental study for the categorization of human anxiety.

Results

The ensemble model used for the classification produces results with a 97 % accuracy rate, 0.98 recall, 0.99 precision, and a 0.99 F1 score. Furthermore, self-curated data is subjected to an advanced spike analysis algorithm that identifies signal spikes and then quantifies the level of anxiety.
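
For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, precision, recall, and F1 follow from binary confusion-matrix counts. The counts are made up for illustration and are not the study's data; the function name is hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts, not taken from the paper
metrics = binary_metrics(tp=45, fp=5, fn=5, tn=45)
```

The sketch assumes every denominator is non-zero; a production version would guard against a class with no predictions.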

Conclusion

The results obtained demonstrate that haptic stimuli are produced smoothly, offering a comprehensive and innovative method of managing anxiety.
Citations: 0
Development and validation of a nomogram-based prognostic model to predict coronary artery lesions in Kawasaki disease from 6847 children in China
IF 4.9; Zone 2 (Medicine); Q1, Computer Science, Interdisciplinary Applications; Pub Date: 2025-01-06; DOI: 10.1016/j.cmpb.2025.108588
Changjian Li, Huayong Zhang, Wei Yin, Yong Zhang

Background and Objective

Predicting potential risk factors for the occurrence of coronary artery lesions (CAL) in children with Kawasaki disease (KD) is critical for subsequent treatment. The aim of our study was to establish and validate a nomogram-based model for identifying children with KD at risk for CAL.

Methods

Hospitalized children with KD attending Wuhan Children's Hospital from Jan 2011 to Dec 2023 were included in the study and were grouped into a training set (4793 cases) and a validation set (2054 cases) using a simple random sampling method in a 7:3 ratio. The analysis was performed using RStudio software, which first used LASSO regression analysis to screen for the best predictors, and then analyzed the screened predictors using logistic regression analysis to derive independent predictors and construct a nomogram model to predict CAL risk. The receiver operating characteristic (ROC) and calibration curves were employed to evaluate the discrimination and calibration of the model. Finally, decision curve analysis (DCA) was utilized to validate the clinical applicability of the models assessed in the data.
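
The 7:3 split described above can be sketched as a simple random partition of patient indices. The authors worked in RStudio; this Python version, its function name, the seed, and the rounding rule are assumptions for illustration, chosen so that the set sizes match the reported 4793 and 2054 cases.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Randomly partition indices 0..n-1 into training and validation sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)    # simple random sampling
    n_train = round(n * train_fraction)     # rounding rule is an assumption
    return indices[:n_train], indices[n_train:]


train_idx, valid_idx = split_indices(6847)
```

Every index lands in exactly one of the two sets, so no patient appears in both training and validation.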

Results

Of the 6847 eligible children with KD included, 845 (12 %) were ultimately diagnosed with CAL, of whom 619 were boys (73 %) with a median age of 1.81 (0.74, 3.51) years. Six significant independent predictors were identified: sex, intravenous immunoglobulin nonresponse, peripheral blood hemoglobin, platelet distribution width, platelet count, and serum albumin. Our model has acceptable discriminative power, with areas under the curve of 0.671 and 0.703 in the training and validation sets, respectively. DCA showed that the prediction model had great clinical utility when the threshold probability was between 0.1 and 0.5.
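
Decision curve analysis compares the net benefit of acting on a model's predictions with default strategies such as treating everyone. The sketch below applies the standard net-benefit formula, TP/n - (FP/n) * pt/(1 - pt), at a single threshold probability pt. The labels and predicted risks are made up and are not the study's data; the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the default strategy of treating every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up outcomes (1 = event) and predicted risks
labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
risks = [0.8, 0.2, 0.1, 0.6, 0.4, 0.05, 0.3, 0.35, 0.15, 0.1]
nb_model = net_benefit(labels, risks, threshold=0.3)
nb_all = net_benefit_treat_all(labels, threshold=0.3)
```

A decision curve repeats this calculation over a range of thresholds; the model is considered useful where its net benefit exceeds both the treat-all and treat-none (net benefit 0) strategies.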

Conclusions

We constructed and internally validated a nomogram-based predictive model using six variables (sex, intravenous immunoglobulin nonresponse, peripheral blood hemoglobin, platelet distribution width, platelet count, and serum albumin), which may be useful for earlier identification of children with KD who may have CAL.
Citations: 0
Machine learning-based 28-day mortality prediction model for elderly neurocritically ill patients
IF 4.9; Zone 2 (Medicine); Q1, Computer Science, Interdisciplinary Applications; Pub Date: 2025-01-06; DOI: 10.1016/j.cmpb.2025.108589
Jia Yuan, Jiong Xiong, Jinfeng Yang, Qi Dong, Yin Wang, Yumei Cheng, Xianjun Chen, Ying Liu, Chuan Xiao, Junlin Tao, Shuangzi Lizhang, Yangzi Liujiao, Qimin Chen, Feng Shen

Background

The growing population of elderly neurocritically ill patients highlights the need for effective prognosis prediction tools. This study aims to develop and validate machine learning (ML) models for predicting 28-day mortality in intensive care units (ICUs).

Methods

Data were extracted from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database, focusing on elderly neurocritically ill patients with ICU stays ≥ 24 h. The cohort was split into 70 % for training and 30 % for internal validation. We analyzed 58 variables, including demographics, vital signs, medications, lab results, comorbidities, and medical scores, using Lasso regression to identify predictors of 28-day mortality. Seven ML algorithms were evaluated, and the best model was validated with data from Guizhou Medical University Affiliated Hospital. A log-rank test was used to assess survival differences in Kaplan-Meier curves. Shapley Additive Explanations (SHAP) were used to interpret the best model, while subgroup analysis identified variations in model performance across different populations.

Results

The study included 1,773 elderly neurocritically ill patients, with a 28-day mortality rate of 28.6 %. The Light Gradient Boosting Machine (LightGBM) outperformed other models, achieving an area under the curve (AUC) of 0.896 in internal validation and 0.812 in external validation. Kaplan-Meier analysis showed that higher LightGBM prediction scores correlated with lower survival probabilities. Key predictors identified through SHAP analysis included partial pressure of arterial carbon dioxide (PaCO2), Acute Physiology and Chronic Health Evaluation II (APACHE II) score, white blood cell count, age, and lactate. The LightGBM model demonstrated consistent performance across various subgroups.
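
The AUC values reported here have a simple interpretation: the probability that a randomly chosen positive case (here, a patient who died) receives a higher score than a randomly chosen negative case. A minimal pairwise sketch of that definition follows, with made-up labels and scores; it is not the study's evaluation code, and the function name is hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up outcomes and model scores
auc_example = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
auc_perfect = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
```

This quadratic-time form is fine for illustration; library implementations compute the same quantity from sorted ranks.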

Conclusions

The LightGBM model effectively predicts 28-day mortality risk in elderly neurocritically ill patients, aiding clinicians in management and resource allocation. Its reliable performance across diverse subgroups underscores its clinical utility.
Citations: 0
Automatic path planning for pelvic fracture reduction with multi-degree-of-freedom
IF 4.9 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-01-06 DOI: 10.1016/j.cmpb.2025.108591
Chao Shi , Qing Yang , Yuantian Wang , Xiangrui Zhao , Shuchang Shi , Lijia Zhang , Sutuke Yibulayimu , Yanzhen Liu , Chendi Liang , Yu Wang , Chunpeng Zhao

Background and objectives

Computer-assisted orthopedic surgical techniques and robotics have improved the therapeutic outcome of pelvic fracture reduction surgery. The preoperative reduction path is one of the prerequisites for robotic movement and an essential reference for manual operation. Because the pelvis is the largest irregular bone and has a complicated morphology, the rotational motion of pelvic fracture fragments directly affects the reduction process. To address this, the primary objective of this study is to develop an efficient and effective algorithm for automatically planning the reduction trajectory in robot-assisted pelvic fracture surgeries.

Methods

After the rotational and reoriented translational degrees of freedom are obtained from the initial and target positions of the fracture fragments, an initial path is computed with an improved path planning method combined with a specifically designed collision detection algorithm. This path is then post-processed (shortened and smoothed) to give the final reduction path. The effectiveness of the algorithm was evaluated on various pelvic fracture models with surrounding muscles and compared with prior relevant implementations.
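The shortening step of such post-processing can be illustrated with a greedy shortcutting pass, a common technique for sampled paths. The sketch below is an assumption-laden toy, not the method of the paper: it works in 2D rather than in the full multi-degree-of-freedom space, it models obstacles as circles, and the names `segment_is_free`, `shortcut`, and `path_length` are hypothetical.

```python
import math


def segment_is_free(a, b, obstacles, steps=50):
    """Return True if sampled points on segment a-b avoid every obstacle.

    obstacles: list of (cx, cy, r) circles (a stand-in for real collision
    detection between bone fragments, which is far more involved).
    """
    for i in range(steps + 1):
        t = i / steps
        x = a[0] + t * (b[0] - a[0])
        y = a[1] + t * (b[1] - a[1])
        for cx, cy, r in obstacles:
            if math.hypot(x - cx, y - cy) <= r:
                return False
    return True


def shortcut(path, obstacles):
    """Greedy shortcutting: from each kept waypoint, jump to the furthest
    later waypoint reachable by a collision-free straight segment.

    Consecutive waypoints of the input path are assumed collision-free.
    """
    out = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = len(path) - 1
        while j > i + 1 and not segment_is_free(path[i], path[j], obstacles):
            j -= 1
        out.append(path[j])
        i = j
    return out


def path_length(path):
    """Total Euclidean length of a polyline."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))
```

With a detour path `[(0, 0), (1, 2), (2, 3), (3, 2), (4, 0)]` around a circular obstacle `(2, 0, 1)`, the pass drops two intermediate waypoints while keeping the path clear of the obstacle. A smoothing step (for example, spline fitting with re-checked collisions) would normally follow.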

Results

Simulation results showed that the planner saves time and outperforms the state of the art in terms of collision detection, path length and smoothness, search time, and stretching of the surrounding muscles.

Conclusions

The proposed method produces a reasonable reduction path for pelvic fractures and is shown to be superior across various pelvic fracture scenarios.
Citations: 0
Effect of plaque micro-watershed changes on carotid atherosclerosis
IF 4.9 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-01-06 DOI: 10.1016/j.cmpb.2024.108582
Chenlong Guo , Xingsen Mu , Xianwei Wang , Yiming Zhao , Haoran Zhang , Dong Chen

Objective

The study aims to elucidate the mechanisms underlying plaque growth by analyzing the variations in hemodynamic parameters within the plaque region of patients' carotid arteries before and after the development of atherosclerotic lesions.

Methods

The study enrolls 25 patients with common carotid artery stenosis and 25 with tandem carotid artery stenosis. Based on pathological analysis, three-dimensional models of the actual blood vessels before and after the lesion are constructed for two patients within a two-year period. Computational fluid dynamics is employed to conduct unsteady periodic non-Newtonian fluid numerical simulations, enabling an in-depth investigation into the changes in the micro-environment of blood flow.
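The abstract does not say which non-Newtonian viscosity model the simulations use. As an illustration only, the sketch below implements the Carreau model, one widely used choice for blood, in which viscosity falls from a zero-shear value toward an infinite-shear value as the shear rate rises. The function names are hypothetical, and the default parameters are values commonly quoted for blood in the literature, assumed here rather than taken from this study.

```python
def carreau_viscosity(shear_rate, mu_0=0.056, mu_inf=0.00345,
                      lam=3.313, n=0.3568):
    """Carreau model: apparent viscosity (Pa.s) at a given shear rate (1/s).

    mu_0   : zero-shear viscosity (Pa.s)      -- assumed literature value
    mu_inf : infinite-shear viscosity (Pa.s)  -- assumed literature value
    lam    : relaxation time (s)              -- assumed literature value
    n      : power-law index                  -- assumed literature value
    """
    factor = (1.0 + (lam * shear_rate) ** 2) ** ((n - 1.0) / 2.0)
    return mu_inf + (mu_0 - mu_inf) * factor


def shear_stress(shear_rate):
    """Shear stress (Pa) as apparent viscosity times shear rate."""
    return carreau_viscosity(shear_rate) * shear_rate
```

Under these assumed parameters the viscosity decreases monotonically with shear rate, which is the shear-thinning behavior that distinguishes such simulations from a constant-viscosity (Newtonian) setup.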

Results

During the systolic phase of the cardiac cycle, vortex regions are particularly prone to developing at the bifurcation point between the common carotid artery and the distal end of the internal carotid artery. In the early diastolic phase, blood reflux phenomena can be observed within the carotid artery. Towards the end of diastole, there is an expansion of vortex regions at the bifurcation point of the carotid artery. The shoulder region of initial small plaques within the blood vessel is susceptible to developing a low-speed recirculation zone, characterized by significantly reduced shear stress compared to the surrounding areas. Following vascular stenosis, the wall shear stress within the plaque domain generally increases; however, it maintains a consistent pattern of high central values and low upper shoulder values. The shear stress at the upper shoulder of the plaque of tandem carotid stenosis is below 0.4 Pa, whereas the central and lower shoulder regions exhibit shear stress exceeding 40 Pa.

Conclusions

The dynamic parameters of the blood flow micro-environment exhibit variations throughout the cardiac cycle, and temporal disparities exist in local lesions within the carotid artery. Both common and tandem carotid artery stenosis are particularly prone to developing lesions at the shoulder of initial small plaques. The micro-flow characteristics within the plaque domain undergo alterations prior to and following the onset of carotid artery disease. Furthermore, the occurrence of restenosis and rupture is associated with the location of plaque growth.
Citations: 0
Dual-path neural network extracts tumor microenvironment information from whole slide images to predict molecular typing and prognosis of Glioma
IF 4.9 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-01-04 DOI: 10.1016/j.cmpb.2024.108580
Zehang Ning , Bojie Yang , Yuanyuan Wang , Zhifeng Shi , Jinhua Yu , Guoqing Wu

Background and Objective:

Using AI to mine tumor microenvironment information in whole slide images (WSIs) for glioma molecular subtype and prognosis prediction is significant for treatment. Existing weakly-supervised learning frameworks based on multi-instance learning show potential in WSI analysis, but the large number of patches in each WSI makes it difficult to extract the microenvironment information of key local patches and their neighboring patches effectively. Therefore, this paper aims to develop an automatic neural network that effectively extracts tumor microenvironment information from WSIs to predict molecular typing and prognosis of glioma.

Methods:

In this paper, we propose a dual-path pathology analysis (DPPA) framework to enhance the analysis of WSIs for glioma diagnosis. First, to mitigate the impact of redundant patches and enhance the integration of salient patch information within a multi-instance learning context, we propose a two-stage attention-based dynamic multi-instance learning network. In this network, two-stage attention and dynamic random sampling are designed to adaptively integrate diverse image patch information in pivotal regions. Second, to exploit the rich spatial context inherent in WSIs, we build a spatial relationship information quantification module. This module captures the spatial distribution of patches that encompass a variety of tissue structures, shedding light on the tumor microenvironment.
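The core idea of attention-based multi-instance learning, weighting patch embeddings by a learned relevance score before pooling them into one slide-level vector, can be sketched as follows. This is a generic single-stage attention pooling in plain Python, not the two-stage dynamic network of DPPA. The function name `attention_pool` and the parameters `V` and `w` are hypothetical, and in practice they would be learned rather than supplied by hand.

```python
import math


def attention_pool(patches, V, w):
    """Pool patch embeddings into one bag embedding with attention weights.

    patches: list of patch embeddings, each a list of floats of length d
    V:       projection matrix as a list of rows, each of length d
    w:       scoring vector with one entry per row of V
    Score for patch h is w . tanh(V h); weights are a softmax over patches.
    Returns (weights, bag_embedding).
    """
    scores = []
    for h in patches:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))

    # Numerically stable softmax over the patch scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(patches[0])
    bag = [sum(a * h[d] for a, h in zip(weights, patches)) for d in range(dim)]
    return weights, bag
```

Patches with higher scores dominate the pooled embedding, which is how redundant patches can be down-weighted without being discarded outright.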

Results:

Extensive experiments on three datasets (two in-house and one public) totaling 1,795 WSIs demonstrate the encouraging performance of the DPPA, with mean areas under the curve of 0.94, 0.85, and 0.88 in predicting Isocitrate Dehydrogenase 1, Telomerase Reverse Transcriptase, and 1p/19q status, respectively, and a mean C-index of 0.82 in prognosis prediction. The proposed model can also split tumors within existing tumor subgroups into good and bad prognosis groups, with P < 0.05 on the log-rank test.
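The C-index reported for prognosis prediction measures how often the model assigns a higher risk to the patient who experiences the event earlier. Below is a minimal sketch of Harrell's concordance index in plain Python. The function name is hypothetical, the implementation is a simple quadratic loop, and the values in the usage note are toy data rather than results from the study.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risks:  predicted risk scores (higher = worse expected prognosis)
    A pair (i, j) is comparable when i had an observed event strictly
    before j's follow-up time. Tied risks count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

For `times=[2, 4, 6, 8]`, `events=[1, 1, 0, 1]`, and `risks=[0.9, 0.3, 0.2, 0.4]`, five pairs are comparable and four are ordered correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.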

Conclusions:

The results of multi-center experiments demonstrate that the proposed DPPA surpasses state-of-the-art models across multiple metrics. Ablation experiments and survival analysis further validate the analytical ability of the model, and interpretability analyses support its reliability and validity. All source code is released at: https://github.com/nzehang97/DPPA.
Citations: 0
Leveraging Transformers-based models and linked data for deep phenotyping in radiology
IF 4.9 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-01-03 DOI: 10.1016/j.cmpb.2024.108567
Lluís-F. Hurtado , Luis Marco-Ruiz , Encarna Segarra , Maria Jose Castro-Bleda , Aurelia Bustos-Moreno , Maria de la Iglesia-Vayá , Juan Francisco Vallalta-Rueda

Background and Objective:

Despite significant investments in the normalization and standardization of Electronic Health Records (EHRs), free text is still the rule rather than the exception in clinical notes. The use of free text has implications for the data reuse methods that support clinical research, since the query mechanisms used in cohort definition and patient matching are mainly based on structured data and clinical terminologies. This study aims to develop a method for the secondary use of clinical text by: (a) using Natural Language Processing (NLP) for tagging clinical notes with biomedical terminology; and (b) designing an ontology that maps and classifies all the identified tags to various terminologies and allows for running phenotyping queries.

Methods and Results:

Transformers-based NLP models, specifically pre-trained RoBERTa language models, were used to process radiology reports and annotate them, identifying elements that match UMLS Concept Unique Identifier (CUI) definitions. CUIs were mapped into several biomedical ontologies useful for phenotyping (e.g., SNOMED-CT, HPO, ICD-10, FMA, LOINC, and ICPC2, among others) and represented as a lightweight ontology using OWL (Web Ontology Language) constructs. This process resulted in a Linked Knowledge Base (LKB), which allows running expressive queries to retrieve reports that comply with specific criteria using automatic reasoning.
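The kind of query that such a knowledge base supports can be illustrated with a toy subsumption lookup: a report tagged with a specific concept should also be returned when the query asks for a broader concept. The sketch below uses plain Python dictionaries rather than OWL and a reasoner, and every identifier in it (the `CUI_...` placeholders, the report IDs, and the function names) is hypothetical and does not correspond to real UMLS concepts or to the authors' ontology.

```python
# Hypothetical is-a hierarchy: child concept -> parent concept.
PARENT = {
    "CUI_PNEUMONIA": "CUI_LUNG_DISEASE",
    "CUI_LUNG_DISEASE": "CUI_DISEASE",
    "CUI_FRACTURE": "CUI_INJURY",
}

# Hypothetical output of the NLP tagging step: report -> set of concept tags.
REPORT_TAGS = {
    "report-1": {"CUI_PNEUMONIA"},
    "report-2": {"CUI_FRACTURE"},
    "report-3": {"CUI_LUNG_DISEASE"},
}


def ancestors(concept):
    """Return all broader concepts reachable by following is-a links."""
    found = set()
    while concept in PARENT:
        concept = PARENT[concept]
        found.add(concept)
    return found


def query(concept):
    """Return reports tagged with the concept or any narrower concept."""
    return sorted(
        report
        for report, tags in REPORT_TAGS.items()
        if any(tag == concept or concept in ancestors(tag) for tag in tags)
    )
```

Here `query("CUI_LUNG_DISEASE")` returns both the report tagged with the lung disease concept and the one tagged with its narrower pneumonia concept. An OWL reasoner generalizes this inference to richer class expressions than a single parent link.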

Conclusion:

Although phenotyping tools mostly rely on relational databases, the combination of NLP and Linked Data technologies allows us to build scalable knowledge bases using standard ontologies from the Web of data. Our approach enables us to execute a pipeline whose input is free text and which automatically maps identified entities to an LKB that can answer phenotyping queries. In this work, we have only used Spanish radiology reports, although the approach is extensible to other languages for which suitable corpora are available. This is particularly valuable in regional and national systems dealing with large research databases from different registries and cohorts, and it plays an essential role in the scalability of large data reuse infrastructures that require indexing and governing distributed data sources.
Citations: 0