
Interdisciplinary Sciences: Computational Life Sciences: Latest Articles

PLMC: Language Model of Protein Sequences Enhances Protein Crystallization Prediction.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-08-19 DOI: 10.1007/s12539-024-00639-6
Dapeng Xiong, Kaicheng U, Jianfeng Sun, Adam P Cribbs

X-ray diffraction crystallography is the most widely used method for determining protein three-dimensional (3D) structures, and it requires that the protein be crystallizable. Protein crystallization, however, involves several steps, including protein material production, purification, and crystal production, each of which affects the crystallization outcome. Because this multi-stage process is expensive and laborious, various computational tools have been developed to predict protein crystallization propensity, which is then used to guide experimental determination. In this study, we present a novel deep learning framework, PLMC, that improves multi-stage protein crystallization propensity prediction by leveraging a pre-trained protein language model. To train PLMC effectively, two groups of features for each protein were integrated into a more comprehensive representation: protein language embeddings derived from a large-scale protein sequence database, and a handcrafted feature set consisting of physicochemical, sequence-based, and disorder-related information. The two feature groups were embedded separately for refinement and then concatenated for the final prediction. Our extensive benchmarking tests show that PLMC greatly outperforms other state-of-the-art methods, achieving AUC scores of 0.773, 0.893, and 0.913, respectively, at the three individual stages named above, and 0.982 at the final crystallization stage. PLMC is also superior at predicting the crystallization of both globular and membrane proteins, with an AUC score of 0.991 for the latter. These results suggest that PLMC has significant potential to assist researchers with the experimental design of crystallizable protein variants.
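
The abstract describes two feature groups that are embedded separately and then concatenated before the final prediction. The following is a minimal sketch of that data flow only, not the authors' architecture: the layer sizes, the weights in `PARAMS`, and the function names are all hypothetical and chosen for illustration.

```python
import math


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_propensity(plm_embedding, handcrafted, params):
    """Embed each feature group on its own branch, concatenate, then score."""
    h_plm = relu(linear(plm_embedding, params["w_plm"], params["b_plm"]))
    h_hand = relu(linear(handcrafted, params["w_hand"], params["b_hand"]))
    fused = h_plm + h_hand  # list concatenation of the two refined embeddings
    logit = linear(fused, [params["w_out"]], [params["b_out"]])[0]
    return fused, sigmoid(logit)


# Hypothetical toy weights (not from the paper): a 4-dim "language model"
# embedding is mapped to 2 dims, a 3-dim handcrafted vector to 1 dim.
PARAMS = {
    "w_plm": [[0.5, -0.2, 0.1, 0.0], [0.3, 0.3, -0.1, 0.2]],
    "b_plm": [0.0, 0.1],
    "w_hand": [[0.4, -0.6, 0.2]],
    "b_hand": [0.05],
    "w_out": [0.7, -0.3, 0.5],
    "b_out": -0.1,
}

fused, prob = predict_propensity([0.2, 0.1, -0.4, 0.9], [1.0, 0.5, 0.3], PARAMS)
print(len(fused), prob)
```

In a real model the weights would be learned and the embedding would come from a pre-trained protein language model; the sketch only shows where the concatenation sits relative to the per-branch refinement.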

Pages: 802-813. Citations: 0
Artificial Intelligence-Based Classification of CT Images Using a Hybrid SpinalZFNet.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-08-21 DOI: 10.1007/s12539-024-00649-4
Faiqa Maqsood, Wang Zhenfei, Muhammad Mumtaz Ali, Baozhi Qiu, Naveed Ur Rehman, Fahad Sabah, Tahir Mahmood, Irfanud Din, Raheem Sarwar

The kidney is an abdominal organ that filters excess water and waste from the blood. Kidney diseases generally occur due to changes in certain supplements, medical conditions, obesity, and diet, which alter kidney function and ultimately lead to complications such as chronic kidney disease, kidney failure, and other renal disorders. Combining patient metadata with computed tomography (CT) images is essential for accurate and timely diagnosis of such complications. Deep Neural Networks (DNNs) have transformed medical fields by providing high accuracy in complex tasks. However, the high computational cost of these models is a significant challenge, particularly in real-time applications. This paper proposes SpinalZFNet, a hybrid deep learning approach that integrates the architectural strengths of the Spinal Network (SpinalNet) with the feature extraction capabilities of the Zeiler and Fergus Network (ZFNet) to classify kidney disease accurately from CT images. This combination enhances feature analysis, significantly improving classification accuracy while reducing computational overhead. First, the acquired CT images are pre-processed with a median filter, and the pre-processed images are segmented using the Efficient Neural Network (ENet). The images are then augmented, and different features are extracted from the augmented CT images. Finally, the proposed SpinalZFNet model uses the extracted features to classify kidney disease as normal, tumor, cyst, or stone. SpinalZFNet outperformed other models, with 99.9% sensitivity, 99.5% specificity, 99.6% precision, 99.8% accuracy, and a 99.7% F1-score in classifying kidney disease.
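
The pipeline above starts with median-filter pre-processing. As a rough illustration of that single step, the sketch below applies a 3 x 3 median filter to a tiny grayscale image stored as nested lists. The window size, the edge handling (replicating border pixels), and the sample image are assumptions; the abstract does not specify them.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2D list, replicating edge pixels."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    radius = size // 2
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighbourhood, clamping coordinates at the borders.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = median(window)
    return out


# Hypothetical 4 x 4 patch with one bright and one dark impulse-noise pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 0],
]
clean = median_filter(noisy)
for row in clean:
    print(row)
```

The median replaces each pixel with the middle value of its neighbourhood, so isolated outliers such as the 255 and 0 pixels are suppressed while flat regions are left unchanged.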

Pages: 907-925. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11512893/pdf/ Citations: 0
Predicting Disease-Metabolite Associations Based on the Metapath Aggregation of Tripartite Heterogeneous Networks.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-08-07 DOI: 10.1007/s12539-024-00645-8
Wenzhi Liu, Pengli Lu

Exploring the interactions between diseases and metabolites has significant implications for the diagnosis and treatment of diseases. However, traditional experimental methods are time-consuming and costly, and current computational methods often overlook the influence of other biological entities on both diseases and metabolites. In light of these limitations, we propose a novel deep learning model based on metapath aggregation of tripartite heterogeneous networks (MAHN) to explore disease-related metabolites. Specifically, we introduce microbes to construct a tripartite heterogeneous network and employ a graph convolutional network and an enhanced GraphSAGE to learn node features along metapaths of length 3. Additionally, we use node-level and semantic-level attention mechanisms, a more granular approach, to aggregate node features along metapaths of length 2. Finally, the reconstructed association probability is obtained by fusing the features from the different metapaths and passing them to a bilinear decoder. Experiments show that the proposed MAHN model achieves superior performance in five-fold cross-validation, with Acc (91.85%), Pre (90.48%), Recall (93.53%), F1 (91.94%), AUC (97.39%), and AUPR (97.47%), outperforming four state-of-the-art algorithms. Case studies on two complex diseases, irritable bowel syndrome and obesity, further validate the predictions, indicating that the MAHN model is a trustworthy tool for discovering potential metabolites. Moreover, deep learning models that integrate multi-omics data represent the future mainstream direction for predicting disease-related biological entities.
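
The final step above scores a disease-metabolite pair with a bilinear decoder. A common form of such a decoder is sigmoid(h_d^T W h_m); the sketch below assumes that form, which the abstract does not spell out. The embeddings and the weight matrix `W` are hypothetical.

```python
import math


def bilinear_score(h_disease, weight, h_metabolite):
    """Return sigmoid(h_d^T W h_m) as an association probability."""
    # First compute W h_m, then take the dot product with h_d.
    projected = [sum(w * m for w, m in zip(row, h_metabolite)) for row in weight]
    logit = sum(d * v for d, v in zip(h_disease, projected))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 2-dim embeddings; with the identity matrix the decoder
# reduces to a plain dot product followed by a sigmoid.
W = [[1.0, 0.0], [0.0, 1.0]]
print(bilinear_score([1.0, 2.0], W, [0.5, -0.25]))
```

A learned, non-identity `W` lets the decoder weight interactions between different embedding dimensions, which a plain dot product cannot do.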

Pages: 829-843. Citations: 0
Viral Rebound After Antiviral Treatment: A Mathematical Modeling Study of the Role of Antiviral Mechanism of Action.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-07-21 DOI: 10.1007/s12539-024-00643-w
Aubrey Chiarelli, Hana Dobrovolny

The development of antiviral treatments for SARS-CoV-2 was an important turning point in the pandemic. The availability of safe and effective antivirals has allowed people to return to normal life. While SARS-CoV-2 antivirals are highly effective at preventing severe disease, there have been concerning reports of viral rebound in some patients after antiviral treatment ends. In this study, we use a mathematical model of viral infection to study the potential of different antivirals to prevent viral rebound. We find that antivirals that block virus production are the most likely to result in viral rebound if the treatment course is not sufficiently long. Because these antivirals do not prevent infection of cells, cells continue to be infected during treatment. When treatment stops, the infected cells begin producing virus at the usual rate. Antivirals that prevent infection of cells are less likely to result in viral rebound, because cells are not being infected during treatment. This study highlights the role of the antiviral mechanism of action in increasing or reducing the probability of viral rebound.
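
The mechanism described above can be illustrated with a standard target-cell-limited model, in which target cells T are infected by virus V at rate beta, infected cells I die at rate delta, and virus is produced at rate p and cleared at rate c. This is a sketch under assumptions: the abstract does not give the authors' equations, and every parameter value below is illustrative rather than fitted.

```python
def simulate(mode, efficacy=1.0, t_start=2.0, t_end=5.0, t_max=10.0, dt=0.001,
             beta=2.0, delta=1.0, p=10.0, c=5.0):
    """Forward-Euler integration of a target-cell-limited infection model.

    mode: "infection" blocks new infections, "production" blocks virus release.
    Returns a list of (T, I, V) states; entry k is the state at time k * dt.
    All parameter values are illustrative assumptions.
    """
    if mode not in ("infection", "production"):
        raise ValueError("mode must be 'infection' or 'production'")
    k_start, k_end = round(t_start / dt), round(t_end / dt)
    T, I, V = 1.0, 0.0, 1e-3  # normalized target cells, infected cells, virus
    states = [(T, I, V)]
    for k in range(round(t_max / dt)):
        on_treatment = k_start <= k < k_end
        eps_inf = efficacy if on_treatment and mode == "infection" else 0.0
        eps_prod = efficacy if on_treatment and mode == "production" else 0.0
        new_infections = (1.0 - eps_inf) * beta * T * V
        dT = -new_infections
        dI = new_infections - delta * I
        dV = (1.0 - eps_prod) * p * I - c * V
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
        states.append((T, I, V))
    return states


k_end = round(5.0 / 0.001)
for mode in ("infection", "production"):
    T_end, I_end, V_end = simulate(mode)[k_end]
    print(mode, "target cells left at end of treatment:", T_end)
```

With a fully effective infection blocker, the target-cell pool is frozen for the duration of treatment, whereas under a production blocker the residual virus keeps infecting cells, so fewer uninfected cells remain when treatment stops. Whether rebound then occurs depends on the parameters and treatment length, which this toy run does not attempt to reproduce.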

Pages: 844-853. Citations: 0
FPJA-Net: A Lightweight End-to-End Network for Sleep Stage Prediction Based on Feature Pyramid and Joint Attention.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-08-19 DOI: 10.1007/s12539-024-00636-9
Zhi Liu, Qinhan Zhang, Sixin Luo, Meiqiao Qin

Sleep staging is the most crucial step before diagnosing and treating sleep disorders. Traditional manual sleep staging is time-consuming and depends on expert skill. Automatic sleep staging based on deep learning is therefore attracting growing research interest. The salient waves in sleep signals carry the most important information for automatic sleep staging. However, existing deep learning methods do not fully exploit this key information, because most of them use only a CNN or an RNN, which cannot effectively capture multi-scale features in salient waves. To address this limitation, we propose a lightweight end-to-end network for sleep stage prediction based on a feature pyramid and joint attention. The feature pyramid module is designed to extract multi-scale features from salient waves effectively; these features are then fed to the joint attention module, which attends closely to the channel and location information of the salient waves. The proposed network has far fewer parameters and achieves a significant performance improvement over state-of-the-art results. Overall accuracy and macro F1 score are 90.1% and 87.8% on Sleep-EDF39, 87.4% and 84.4% on Sleep-EDF153, and 86.9% and 83.9% on SHHS, all public datasets. Ablation experiments confirm the effectiveness of each module.
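
The results above are reported as overall accuracy and macro F1. For readers less familiar with the second metric, here is a small worked example: macro F1 is the unweighted mean of the per-class F1 scores, so rare stages count as much as common ones. The stage labels and predictions below are made up for illustration and are not taken from any of the datasets.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (overall accuracy, macro-averaged F1) for two label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Hypothetical epochs labelled with the five usual sleep stages.
y_true = ["W", "W", "N1", "N2", "N2", "N3", "REM", "REM"]
y_pred = ["W", "W", "N2", "N2", "N2", "N3", "REM", "N1"]
acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(acc, macro_f1)
```

In this toy case the single N1 epoch is missed entirely, which pulls macro F1 well below accuracy; the same effect explains why macro F1 is usually the lower of the two numbers in sleep staging results.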

Pages: 769-780. Citations: 0
Function-Genes and Disease-Genes Prediction Based on Network Embedding and One-Class Classification.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-09-04 DOI: 10.1007/s12539-024-00638-7
Weiyu Shi, Yan Zhang, Yeqing Sun, Zhengkui Lin

Genes that have been experimentally validated for diseases (or functions) can be used to develop machine learning methods that predict new disease/function-genes. However, predicting function-genes and predicting disease-genes face the same problem: only confirmed positive examples are available, with no negative examples. To solve this problem, we propose VGAEMCD, a function/disease-gene prediction algorithm based on network embedding (Variational Graph Auto-Encoders, VGAE) and one-class classification (Fast Minimum Covariance Determinant, Fast-MCD). First, we construct a protein-protein interaction (PPI) network centered on experimentally validated genes; then VGAE is used to obtain embeddings of the nodes (genes) in the network; finally, the embeddings are fed into an improved deep learning one-class classifier based on Fast-MCD to predict function/disease-genes. VGAEMCD predicts function-genes and disease-genes in a unified way and requires only the experimentally verified genes as input (no expression profile is needed). VGAEMCD outperforms classical one-class classification algorithms in Recall, Precision, F-measure, Specificity, and Accuracy. Further experiments show that seven metrics of VGAEMCD are higher than those of state-of-the-art function/disease-gene prediction algorithms. These results indicate that VGAEMCD learns the distribution of positive examples well and accurately identifies function/disease-genes.
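
The one-class idea above (learn the distribution of positive examples, then flag anything far from it) can be sketched with a Mahalanobis-distance test. Note the simplification: this sketch uses the plain sample covariance of all positives, not Fast-MCD, which would instead search for the subset of points with the smallest covariance determinant to resist outliers. The 2-dim "embeddings" and the 97.5% cut-off are hypothetical.

```python
import math


def fit_gaussian_2d(points):
    """Return the mean and sample covariance (sxx, sxy, syy) of 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2 x 2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix is not positive definite")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def is_inlier(point, mean, cov, quantile=0.975):
    """Accept a point if it lies inside the chosen chi-square contour."""
    # For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - q).
    threshold = -2.0 * math.log(1.0 - quantile)
    return mahalanobis_sq(point, mean, cov) <= threshold


# Hypothetical embeddings of experimentally validated (positive) genes.
positives = [(1.0, 2.0), (1.2, 1.9), (0.8, 2.1), (1.1, 2.2), (0.9, 1.8)]
mean, cov = fit_gaussian_2d(positives)
print(is_inlier((1.05, 2.0), mean, cov), is_inlier((3.0, 3.0), mean, cov))
```

Only positive examples are needed to fit the model, which is the property the abstract relies on; candidate genes are then ranked or accepted by their distance to the positive cloud.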

Pages: 781-801. Citations: 0
A Pragmatic Approach to Fetal Monitoring via Cardiotocography Using Feature Elimination and Hyperparameter Optimization.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-10-05 DOI: 10.1007/s12539-024-00647-6
Fırat Hardalaç, Haad Akmal, Kubilay Ayturan, U Rajendra Acharya, Ru-San Tan

Cardiotocography (CTG) is used to assess fetal health during birth or antenatally in the third trimester. It concurrently records maternal uterine contractions (UC) and fetal heart rate (FHR). Fetal distress, which may require therapeutic intervention, can be diagnosed from the baseline FHR and its response to uterine contractions. This study proposes a pragmatic machine learning strategy for CTG data, based on feature reduction and hyperparameter optimization, to classify fetal state as Normal, Suspect, or Pathological. Such a strategy could serve as a decision support tool for managing pregnancies. The model was assessed on a public dataset of 2126 CTG recordings using various standard classifiers relevant to this dataset, and the proposed method improved their accuracy. With Random Forest, the best classifier, model accuracy rose to 97.20%. In practical terms, the model correctly predicted 100% of the pathological cases and 98.8% of the normal cases in the dataset. The proposed model was also applied to another public CTG dataset of 552 CTG signals, where it reached 97.34% accuracy. If integrated with telemedicine, the proposed model could also be used for long-distance "stay at home" fetal monitoring in high-risk pregnancies.
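
The strategy above combines feature elimination with hyperparameter optimization. The sketch below shows one simple version of each: greedy backward elimination and an exhaustive grid search, both driven by a caller-supplied score function. The abstract does not say which elimination or search procedure the authors used, and the feature names, grid values, and toy score functions here are hypothetical stand-ins for a cross-validated accuracy.

```python
from itertools import product


def backward_eliminate(features, score_fn):
    """Greedily drop one feature at a time while the score does not get worse."""
    selected = list(features)
    best = score_fn(selected)
    while len(selected) > 1:
        candidates = [[f for f in selected if f != drop] for drop in selected]
        top = max(candidates, key=score_fn)
        top_score = score_fn(top)
        if top_score < best:
            break  # every possible removal hurts, so stop here
        selected, best = top, top_score
    return selected, best


def grid_search(grid, score_fn):
    """Evaluate every combination in the grid and return the best setting."""
    names = sorted(grid)
    combos = (dict(zip(names, values)) for values in product(*(grid[n] for n in names)))
    return max(combos, key=score_fn)


# Hypothetical score: two informative features help, every other feature hurts.
GAIN = {"baseline_fhr": 0.3, "accelerations": 0.2}


def toy_feature_score(selected):
    return 0.5 + sum(GAIN.get(f, -0.05) for f in selected)


# Hypothetical score surface that peaks at 100 trees and depth 8.
def toy_param_score(params):
    return -abs(params["n_estimators"] - 100) - abs(params["max_depth"] - 8)


kept, score = backward_eliminate(
    ["baseline_fhr", "accelerations", "noise_a", "noise_b"], toy_feature_score
)
best_params = grid_search({"n_estimators": [50, 100, 200], "max_depth": [4, 8]}, toy_param_score)
print(kept, best_params)
```

In practice the score function would train and cross-validate a classifier such as a random forest on the CTG features, which makes each call expensive; that cost is the main reason to reduce the feature set before searching the hyperparameter grid.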

Citations: 0
Adap-BDCM: Adaptive Bilinear Dynamic Cascade Model for Classification Tasks on CNV Datasets.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-05-17 DOI: 10.1007/s12539-024-00635-w
Liancheng Jiang, Liye Jia, Yizhen Wang, Yongfei Wu, Junhong Yue

Copy number variation (CNV) is an essential genetic driver of cancer formation and progression, which makes intelligent classification based on CNV feasible. However, current machine learning and deep learning methods face several challenges, such as designing base classifier combination schemes in ensemble methods and selecting the layers of neural networks, and these often result in low accuracy. Therefore, an adaptive bilinear dynamic cascade model (Adap-BDCM) is developed to improve the accuracy and applicability of such methods for intelligent classification on CNV datasets. In this model, a feature selection module mitigates the interference of redundant information, and a bilinear model based on a gated attention mechanism extracts more useful deep fusion features. Furthermore, an adaptive base classifier selection scheme removes the need to design base classifier combinations manually and enhances the applicability of the model. Lastly, a novel feature fusion scheme with an attribute recall submodule is constructed, which helps avoid getting stuck in local solutions and losing valuable information. Numerous experiments demonstrate that Adap-BDCM exhibits the best performance in cancer classification, stage prediction, and recurrence prediction on CNV datasets. This study can help physicians make faster and better diagnoses.

Citations: 0
CVGAE: A Self-Supervised Generative Method for Gene Regulatory Network Inference Using Single-Cell RNA Sequencing Data.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-05-23 DOI: 10.1007/s12539-024-00633-y
Wei Liu, Zhijie Teng, Zejun Li, Jing Chen

Gene regulatory network (GRN) inference based on single-cell RNA sequencing (scRNA-seq) data plays a crucial role in understanding the regulatory mechanisms between genes. Various computational methods have been employed for GRN inference, but their network accuracy and model generalization remain unsatisfactory because of high-dimensional data and network sparsity. In this paper, we propose CVGAE, a self-supervised method for gene regulatory network inference using single-cell RNA sequencing data. CVGAE uses a graph neural network for inductive representation learning, merging gene expression data and the observed topology into a low-dimensional vector space. The trained vectors are then used to compute distances between genes and, from these, to predict interactions between genes. Within the overall framework, FastICA is applied to relieve the computational complexity caused by high-dimensional data, and CVGAE adopts stacked GraphSAGE layers as an encoder together with an improved decoder to overcome network sparsity. CVGAE is evaluated on several single-cell datasets with four related ground-truth networks, and the results show that it achieves better performance than the comparison methods. To validate its learning and generalization capabilities, CVGAE is also applied in a few-shot setting by changing the ratio of the training set to the test set. Under few-shot conditions, CVGAE obtains comparable or superior performance.
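
The abstract states that the learned gene vectors are used to compute a distance between genes and then to predict interactions, but it does not name the distance or describe the decoder. As a rough illustration only, the Python sketch below assumes Euclidean distance and simply ranks gene pairs by closeness. The embeddings, the gene names, and the `rank_candidate_edges` helper are all hypothetical and are not taken from CVGAE.

```python
import math
from itertools import combinations

def rank_candidate_edges(embeddings):
    """Rank unordered gene pairs by Euclidean distance, closest first.

    `embeddings` maps a gene name to its low-dimensional vector.
    Treating pairs as unordered is a simplification: real regulatory
    edges are directed (regulator -> target).
    """
    pairs = []
    for a, b in combinations(sorted(embeddings), 2):
        d = math.dist(embeddings[a], embeddings[b])
        pairs.append((a, b, d))
    return sorted(pairs, key=lambda item: item[2])

# Hypothetical 2-D embeddings, for illustration only.
embeddings = {
    "geneA": (0.0, 0.0),
    "geneB": (0.1, 0.0),
    "geneC": (3.0, 4.0),
}
ranked = rank_candidate_edges(embeddings)
```

Under this assumption the closest pair would be the strongest candidate interaction; a threshold or a top-k cutoff on the ranked list would then yield a predicted network.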

Citations: 0
Integrating HRMAS-NMR Data and Machine Learning-Assisted Profiling of Metabolite Fluxes to Classify Low- and High-Grade Gliomas.
IF 3.9 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-01 Epub Date: 2024-09-27 DOI: 10.1007/s12539-024-00642-x
Safia Firdous, Zubair Nawaz, Rizwan Abid, Leo L Cheng, Syed Ghulam Musharraf, Saima Sadaf

Diagnosing and classifying central nervous system tumors such as gliomas or glioblastomas poses a significant challenge due to their aggressive and infiltrative nature. However, recent advances in metabolomics and magnetic resonance spectroscopy (MRS) offer promising avenues for differentiating tumor grades both in vivo and ex vivo. This study aimed to explore tissue-based metabolic signatures that distinguish low-grade from high-grade gliomas. Forty-six histologically confirmed, intact solid tumor samples from glioma patients were analyzed using high-resolution magic angle spinning nuclear magnetic resonance (HRMAS-NMR) spectroscopy. By integrating machine learning (ML) algorithms, the spectral regions with the most discriminative potential were identified. Validation was performed through univariate and multivariate statistical analyses, along with HRMAS-NMR analyses of 46 paired plasma samples. Among the ML models applied, logistic regression identified 46 spectral regions capable of sub-classifying gliomas with an accuracy of 87% (F1-measure 0.87, precision 0.82, recall 0.93), whereas the extra-trees classifier identified three spectral regions with a predictive accuracy of 91% (F1-measure 0.91, precision 0.85, recall 0.97). The Wilcoxon test identified 51 spectral regions that significantly differentiated the low- and high-grade glioma groups (p < 0.05). Based on sensitivity and area under the curve values, 40 spectral regions corresponding to 18 metabolites were considered potential biomarkers for tissue-based glioma classification; among these, N-acetyl aspartate, glutamate, and glutamine emerged as the most important markers. These markers were validated in paired plasma samples, and their absolute concentrations were computed. Our results demonstrate that the metabolic markers identified through the HRMAS-NMR-ML analysis framework, and their associated metabolic networks, hold promise for targeted treatment planning and clinical interventions in the future.
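
The F1-measures quoted above can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The short Python sketch below does this for the two reported models. The inputs are the rounded values from the abstract, so this is a consistency check on rounded figures rather than a recomputation from the study's data, and the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Rounded precision/recall pairs as reported in the abstract.
f1_first_model = f1_score(0.82, 0.93)    # model reported with accuracy 87%
f1_second_model = f1_score(0.85, 0.97)   # model reported with accuracy 91%

print(round(f1_first_model, 2), round(f1_second_model, 2))
```

Both values round to the F1-measures given in the abstract (0.87 and 0.91), so the three reported metrics are mutually consistent for each model.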

Citations: 0