Computational biology and chemistry: latest articles

Application of bioinformatic tools in cell type classification for single-cell RNA-seq data.
Pub Date : 2025-01-02 DOI: 10.1016/j.compbiolchem.2024.108332
Shah Tania Akter Sujana, Md Shahjaman, Atul Chandra Singha

Advances in single-cell RNA sequencing (scRNA-seq) technology have significantly transformed genomics research, enabling thousands of cells to be handled in each experiment. As of now, 32,068 research studies have been cataloged in the PubMed database. The primary aims of scRNA-seq investigations are to identify cell types, understand the antitumor immune response, and identify new and uncommon cell types. Traditional techniques for identifying cell types rely on microscopy, histology, and pathological characteristics. However, the complexity of the instruments and the need for precise experimental design make it difficult to capture the overall heterogeneity fully. Both unsupervised clustering and supervised classification methods have been used for this task. Supervised cell type classification methods have gained popularity because they draw on large-scale, high-quality, well-annotated data and give more robust results than clustering methods. A recent study showed that the support vector machine (SVM) gives high-quality classification performance in different scenarios. In this article, we compare and evaluate the performance of four SVM kernels (sigmoid, linear, radial, and polynomial). Experiments on three standard scRNA-seq datasets indicate that SVM with the linear kernel and SVM with the sigmoid kernel classify the cells more accurately (approx. 99 %), and that the linear kernel has a remarkably fast computation time. We also evaluate the results using single-cell-specific evaluation metrics: F1 score, MCC, and AUC. Additionally, the study sheds light on the potential of SVM kernels to reveal the underlying information in single-cell RNA-seq data more effectively.
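
The four kernels and two of the evaluation metrics named above can be written out in a few lines. The following is a minimal sketch, not the authors' code: the kernel formulas follow the common parameterization (gamma, coef0, degree), the default parameter values are arbitrary, and the input vectors and confusion-matrix counts are invented for illustration.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    # K(x, y) = <x, y>
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    # K(x, y) = (gamma * <x, y> + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree

def radial_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)

def f1_and_mcc(tp, fp, fn, tn):
    """F1 score and Matthews correlation coefficient from binary counts."""
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return f1, mcc

# Two toy expression vectors (invented values, two genes each).
cell_a = (1.0, 2.0)
cell_b = (3.0, 4.0)
similarities = {
    "linear": linear_kernel(cell_a, cell_b),
    "polynomial": polynomial_kernel(cell_a, cell_b, degree=2),
    "radial": radial_kernel(cell_a, cell_b, gamma=0.5),
    "sigmoid": sigmoid_kernel(cell_a, cell_b, gamma=0.1),
}

# Invented confusion-matrix counts for one cell type versus the rest.
f1, mcc = f1_and_mcc(tp=2, fp=1, fn=1, tn=4)
```

The linear kernel needs only one dot product per pair of cells, which is consistent with the short computation time reported for it.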

Citations: 0
An ensemble learning method combined with multiple feature representation strategies to predict lncRNA subcellular localizations.
Pub Date : 2025-01-01 DOI: 10.1016/j.compbiolchem.2024.108336
Lina Zhang, Sizan Gao, Qinghao Yuan, Yao Fu, Runtao Yang

Long non-coding RNAs (lncRNAs) are strongly associated with cellular physiological mechanisms and implicated in numerous diseases. By exploring the subcellular localizations of lncRNAs, we can not only gain crucial insights into the molecular mechanisms of lncRNA-related biological processes but also make valuable contributions towards the diagnosis, prevention, and treatment of various human diseases. However, conventional experimental techniques tend to be laborious and time-intensive, so computational methods are in increasing demand. This paper develops an ensemble method that incorporates hybrid features to predict the subcellular localizations of lncRNAs accurately. Because features from a single source do not fully reflect the inherent correlation with the prediction target, heterogeneous multi-source features are used, introducing information on sequence composition, physicochemical properties, and structure. To address class imbalance in the benchmark dataset, the Synthetic Minority Over-sampling Technique (SMOTE) is employed. Finally, the resulting predictor, termed lncSLPre, is built by integrating the outputs of the individual classifiers. Experimental findings suggest that the complementarity of multi-source heterogeneous features improves prediction performance. They also show that SMOTE is effective in mitigating the dataset imbalance, while the feature selection approach is critical in eliminating extraneous and redundant features. Compared with existing advanced methods, lncSLPre achieves better performance, with overall accuracy improvements of 13.13%, 2.15%, and 3.23%, respectively, indicating that lncSLPre can effectively predict lncRNA subcellular localizations.
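
The core idea of SMOTE is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates only that interpolation step, under stated assumptions: the function name `smote`, the choice of `k`, and the two-dimensional feature vectors are all hypothetical and are not taken from lncSLPre.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Invented feature vectors for an under-represented localization class.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8), (1.2, 1.9)]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies between two real minority samples, oversampling stays inside the region the minority class already occupies rather than duplicating existing rows.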

Citations: 0
Hybrid optimization enabled DenseNet for autism spectrum disorders using MRI image.
Pub Date : 2024-12-30 DOI: 10.1016/j.compbiolchem.2024.108335
Sakthi Ulaganathan, Pon Harshavardhanan, N V Ganapathi Raju, G Parthasarathy

Autism spectrum disorder (ASD) is a neurodevelopmental disorder caused by various changes in the brain. It affects daily life through difficulties with social interaction and communication. Most previous research has applied various techniques to the early detection of ASD, but these suffer from drawbacks such as long diagnosis times and low accessibility. This paper aims to develop the JSTO-DenseNet model for the detection of ASD. An autism brain image is taken as the input to the image pre-processing phase. In pre-processing, noise is removed using Gaussian filtering, and Region of Interest (ROI) extraction is carried out. Thereafter, pivotal regions are extracted based on functional connectivity using the proposed Jaya Sewing Training Optimization (JSTO), which is newly introduced by combining the Jaya algorithm and Sewing Training-Based Optimization (STBO). This yields output-1. In the feature extraction phase, grey level co-occurrence matrix (GLCM) features such as entropy, correlation, energy, homogeneity, inverse difference moment, and angular second moment, together with texture features, namely local ternary patterns (LTP), Local Optimal Oriented Pattern (LOOP), and Histogram of Oriented Gradients (HOG), are extracted from the Magnetic Resonance Imaging (MRI) data. This yields output-2. From output-1 and output-2, ASD classification is accomplished using DenseNet, which is trained with the same proposed JSTO. On the ABIDE I dataset, the proposed JSTO-DenseNet model achieves the highest accuracy of 94.8 %, True Positive Rate (TPR) of 90 %, True Negative Rate (TNR) of 90.5 %, and un-weighted average recall (UAR) of 89.8 %, and the lowest False Negative Rate (FNR) of 86.7 % and False Positive Rate (FPR) of 82.6 %, when compared with other methods such as Explainable Artificial Intelligence (XAI), Hybrid deep lightweight feature generator, CLAttention, Two stream end-to-end deep learning (DL), Auto-Encoder feature representation, and Fuzzy Inference Gait System-Deep Extreme Adaptive Fuzzy (FIGS-DEAF).
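
To make the GLCM features concrete, the following sketch builds a normalised, symmetric co-occurrence matrix for a single pixel offset and derives three texture statistics from it. It is an illustration under stated assumptions rather than the paper's pipeline: the 3x3 image, the number of grey levels, and the horizontal offset are invented, and naming conventions for these statistics vary between texts (here the angular second moment is the sum of squared entries and homogeneity uses the 1 / (1 + (i - j)^2) weighting).

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric grey-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Angular second moment, entropy and homogeneity of a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "asm": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }

# Invented 3x3 patch quantised to three grey levels (0, 1, 2).
patch = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 2],
]
matrix = glcm(patch, levels=3)
features = glcm_features(matrix)
```

In practice one such feature vector is computed per offset and direction, and the vectors are concatenated before classification.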

Citations: 0
Unraveling the interplay between cardiovascular diseases and alcohol use disorder: A bioinformatics and network-based exploration of shared molecular pathways and key biomarkers validation via western blot analysis.
Pub Date : 2024-12-30 DOI: 10.1016/j.compbiolchem.2024.108338
Kamelia Zaman Moon, Md Habibur Rahman, Md Jahangir Alam, Md Arju Hossain, Sungho Hwang, Sojin Kang, Seungjoon Moon, Moon Nyeo Park, Chi-Hoon Ahn, Bonglee Kim

Clinical observations indicate a pronounced exacerbation of Cardiovascular Diseases (CVDs) in individuals with Alcohol Use Disorder (AUD), suggesting an intricate interplay between these conditions. Pinpointing shared risk factors for both conditions has proven elusive. To address this, we developed a bioinformatics framework and network-based strategy to find genes exhibiting aberrant expression patterns in both AUD and CVDs. In heart tissue samples from patients with both AUD and CVDs, our study identified 76 Differentially Expressed Genes (DEGs), which were then used to retrieve important Gene Ontology (GO) terms and metabolic pathways, highlighting mechanisms such as proinflammatory cascades, T-cell cytotoxicity, and antigen processing and presentation. Using Protein-Protein Interaction (PPI) analysis, we identified key hub proteins that have a significant impact on the pathophysiology of these illnesses. Several hub proteins were identified, including PTGS2, VCAM1, CCL2, CXCL8, and IL7R; among the hub proteins, only CDH1 was covered in 10 algorithms of the cytoHubba plugin. Furthermore, we pinpointed several Transcription Factors (TFs), including SOD2, CXCL8, THBS2, GREM1, CCL2, and PTGS2, alongside potential microRNAs (miRNAs) such as hsa-mir-203a-3p, hsa-mir-23a-3p, hsa-mir-98-5p, and hsa-mir-7-5p, which exert critical regulatory control over gene expression… The in vitro study investigated the effect of alcohol on E-cadherin (CDH1) expression in HepG2 and Hep3B cells, showing a significant decrease in expression following ethanol treatment. These findings suggest that alcohol exposure may disrupt cell adhesion, potentially contributing to cellular changes associated with cardiovascular diseases. Our approach has revealed distinctive biomarkers delineating the dynamic interplay between AUD and various cardiovascular conditions for future therapeutic exploration.
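
Hub proteins in a PPI network are typically ranked by a centrality score. As a minimal sketch of the idea, the code below ranks nodes by degree (the number of interaction partners), which is one of the simplest such scores. The edge list uses placeholder protein names (P1 to P5) and is entirely invented; it does not reproduce the study's network or the scoring methods it applied.

```python
from collections import Counter

def top_hubs(edges, n):
    """Return the n nodes with the highest degree as (node, degree) pairs.
    Ties are broken alphabetically so the result is deterministic."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Hypothetical undirected interactions between placeholder proteins.
edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "P5"),
]
hubs = top_hubs(edges, n=2)
```

Running several different scores and keeping the proteins that rank highly in most of them is a common way to make hub selection less sensitive to the choice of a single measure.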

Citations: 0
Insights into the behaviour of phosphorylated DNA breaks from molecular dynamic simulations.
Pub Date : 2024-12-30 DOI: 10.1016/j.compbiolchem.2024.108337
Li Zhang, Outi Lampela, Lari Lehtiö, André H Juffer

Single-stranded breaks (SSBs) are the most frequent DNA lesions threatening genomic integrity. Understanding how DNA sensor proteins recognize certain SSB types is crucial for studies of the DNA repair pathways. During repair of damaged DNA, the final SSB that is to be ligated contains a 5'-phosphorylated end. The present work employed molecular dynamics (MD) simulation of DNA with a phosphorylated break in solution to address multiple questions regarding the dynamics of the break site. How does the 5'-phosphate group behave before it initiates a connection with other biomolecules? What is the conformation of the SSB site when it is likely to be recognized by DNA repair factors once the DNA repair response is triggered? And how are the structure and dynamics of DNA affected by the presence of a break? For this purpose, a series of MD simulations of 20 base pair DNAs, each with either a pyrimidine-based or purine-based break, were completed with a combined simulation time of over 20,000 ns and compared with intact DNA of the same sequence. An analysis of the DNA forms, translational and orientational helical parameters, local break site stiffness, bending angles, 5'-phosphate group orientation dynamics, and the effects of the protonation state of the break site phosphate group provides insights into the mechanism of break site recognition.
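
One simple way to quantify a bending angle is as the angle between the helix-axis directions of the two DNA segments flanking the break. The sketch below shows that calculation only; it assumes the two axis vectors have already been fitted from a trajectory frame, and the vectors used here are invented. The abstract does not state how the authors defined their bending angles.

```python
import math

def bending_angle(axis_a, axis_b):
    """Angle in degrees between two helix-axis direction vectors."""
    dot = sum(a * b for a, b in zip(axis_a, axis_b))
    norm_a = math.sqrt(sum(a * a for a in axis_a))
    norm_b = math.sqrt(sum(b * b for b in axis_b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_theta))

# Hypothetical axis directions for the segments on either side of the break.
straight = bending_angle((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
bent = bending_angle((0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

Applied to every frame of a trajectory, this gives a distribution of angles whose mean and spread can be compared between broken and intact DNA.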

Citations: 0
CPI-GGS: A deep learning model for predicting compound-protein interaction based on graphs and sequences.
Pub Date : 2024-12-29 DOI: 10.1016/j.compbiolchem.2024.108326
Zhanwei Hou, Zhenhan Xu, Chaokun Yan, Huimin Luo, Junwei Luo

Background: Compound-protein interaction (CPI) is essential to drug discovery and design, where traditional methods are often costly and have low success rates. Recently, the integration of machine learning and deep learning in CPI research has shown potential to reduce costs and enhance discovery efficiency by improving protein target identification accuracy. Additionally, with an urgent need for novel therapies against complex diseases, CPI investigation could lead to the identification of effective new drugs. Since drug-target interactions involve complex biological processes, refined models are necessary for precise feature extraction and analysis. Nevertheless, current CPI prediction methods still face significant limitations: predictions lack sufficient accuracy, models require improved generalization ability, and further validation across diverse datasets remains essential.

Results: To address some issues at the current stage, this paper proposes a combined deep learning method, CPI-GGS, for predicting and analyzing compound-protein interactions. The source code is available on GitHub at https://github.com/xingjie321/CPI-GGS.

Conclusions: The experimental results demonstrate improved accuracy in predicting compound-protein interactions and enhance the understanding of how compounds and proteins interact, providing a valuable new tool for drug discovery and development.
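
The abstract does not describe how CPI-GGS encodes its inputs, so the following is only a generic sketch of what "graphs and sequences" usually means in CPI models: the compound becomes a graph whose nodes are atoms and whose edges are bonds, and the protein becomes a list of overlapping n-gram "words". The function names, the n-gram length, and the example molecule and sequence are all assumptions made for illustration.

```python
def compound_graph(atoms, bonds):
    """Adjacency list keyed by atom index; `atoms` holds the node labels."""
    adjacency = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency

def protein_ngrams(sequence, n=3):
    """Split an amino-acid sequence into overlapping n-grams."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]

# Heavy atoms of ethanol (C-C-O) as a toy compound.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
graph = compound_graph(atoms, bonds)

# An invented short peptide standing in for a protein sequence.
words = protein_ngrams("MKTAYIAK")
```

A graph neural network would then consume `graph` with per-atom features, while a sequence model would consume embeddings of `words`; the two representations are combined to predict whether the pair interacts.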

Citations: 0
The immune microenvironment related biomarker CCL18 for patients with gout by comprehensive analysis.
Pub Date : 2024-12-28 DOI: 10.1016/j.compbiolchem.2024.108334
Mingchao Zhang, Zhenming Lin, Wenbin Liu

In the present study, we uncovered and validated potential biomarkers related to gout, characterized by the accumulation of sodium urate crystals in different joint and non-joint structures. The data set GSE160170 was obtained from the GEO database. We conducted differential gene expression analysis, GO enrichment assessment, and KEGG pathway analysis to understand the underlying processes. The overlap of 66 methodologies was visualized through UpSetR (v1.3.3). We used Cytoscape's cytoHubba to detect pivotal genes and mapped out protein-protein interaction (PPI) networks. The overlapping targets among upregulated, downregulated, and key genes were depicted using a Venn diagram. CIBERSORT was employed to ascertain the composition of 22 immune cell types in tissue samples. Subsequently, CCL18 levels in serum samples were quantified using enzyme-linked immunosorbent assay (ELISA) and served as a biomarker evaluation metric. The DEG analysis revealed 1000 genes with varied expression (with an even split of 500 upregulated and 500 downregulated genes) when contrasting gout patients with healthy counterparts. The GO enrichment findings revealed a predominant association with small molecule degradation, positive regulatory catabolic mechanism, organelle division, signal transduction, and axon formation. KEGG assay associated the DEGs predominantly with conditions such as systemic lupus erythematosus, pathways such as tumor necrosis factor (TNF) signaling, as well as alcohol dependency and necroptosis. Intersections were visualized using UpSetR, resulting in the identification of 20 hub genes. A Venn representation highlighted five upregulated genes and three downregulated genes. CIBERSORT analysis revealed a noticeable increase in the number of gamma delta T cells and regulatory T cells. The PPI network analysis revealed CC Chemokine ligand 18 (CCL18) as a critical gene. Gout-afflicted samples exhibited a heightened CCL18 expression compared to healthy ones (P < 0.01). 
Altogether, CCL18 is a promising biomarker for patients with gout and is suitable for predicting gout.

Citations: 0
Investigation of mesenchymal stem cell secretome on breast cancer gene expression: A bioinformatic approach to identify differentially expressed genes, functional networks, and potential therapeutic targets.
Pub Date : 2024-12-28 DOI: 10.1016/j.compbiolchem.2024.108331
Mohammad Rasouli, Fatemeh Safari, Raheleh Roudi, Navid Sobhani

The mesenchymal stem cell (MSC) secretome plays a pivotal role in shaping the tumor microenvironment, influencing both cancer progression and potential therapeutic outcomes. In this research, using the publicly available dataset GSE196312, we investigated the effect of the MSC secretome on breast cancer cell gene expression. Our results revealed differentially expressed genes, including the upregulation of Phosphatidylinositol-3,4,5-Trisphosphate Dependent Rac Exchange Factor 1 (PREX1) and C-C Motif Chemokine Ligand 28 (CCL28), and the downregulation of Collagen Type I Alpha 1 Chain (COL1A1), Collagen Type I Alpha 3 Chain (COL1A3), and Collagen Type III Alpha 1 Chain (COL3A1), which contribute to extracellular matrix (ECM) weakening and promote cell migration. Functional enrichment analyses also highlighted suppression of ECM remodeling pathways and activation of calcium ion binding and the Rap1 signaling pathway. We propose that Ca2+-mediated activation of Ras-related protein 1 (Rap1), acting through its downstream pathways such as Matrix Metalloprotease (MMP), PI3K/Akt, and MEK/ERK signaling, contributes to the promotion of cell migration. However, the co-culture model reduced Fibronectin 1 (FN1) and Secreted Protein Acidic and Cysteine Rich (SPARC) gene expression in cancer cells, which points to the therapeutic potential of the MSC secretome. These findings highlight the double-edged nature of the MSC secretome's effect on cancer cell behavior: although our main results point to cancer progression through ECM remodeling, the therapeutic aspects should not be overlooked.

Citations: 0
DMNAG: Prediction of disease-metabolite associations based on Neighborhood Aggregation Graph Transformer.
Pub Date : 2024-12-28 DOI: 10.1016/j.compbiolchem.2024.108320
Pengli Lu, Jiajie Gao, Wenzhi Liu

The metabolic level within an organism typically reflects its health status. Studying the relationship between human diseases and metabolites helps medical professionals diagnose diseases early and predict risk. However, traditional biological experiments often require substantial resources and manpower, and the performance of existing predictive models still leaves room for improvement. To address these limitations, we propose DMNAG, a novel method based on the Neighborhood Aggregation Graph Transformer (NAGphormer) that predicts potential associations between diseases and metabolites, aiming to guide biological experiments and improve experimental efficiency. First, we calculate the Gaussian kernel similarity of diseases and the physicochemical similarity of metabolites, and combine them with known associations to construct a bipartite heterogeneous network. We then calculate the semantic similarity of diseases and the Mol2vec similarity of metabolites, using them as the similarity feature vectors for the disease nodes and metabolite nodes, respectively. We also calculate positional information features for the nodes and combine them with the similarity features to form the initial node features. Next, we input the bipartite heterogeneous network and the initial node features into the Hop2Token module to capture multi-hop neighborhood information between nodes. Finally, we input the multi-hop node features into the Transformer model for training and obtain the edge prediction probabilities through the decoder. In five-fold cross-validation, our model achieved an AUC of 0.9801 and an AUPR of 0.9818. In case studies, most DMNAG-predicted associations have been validated, supporting the model's reliability.
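
The Hop2Token step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code: it assumes the usual NAGphormer formulation, in which the hop-k token of a node is row v of Â^k X, and it assumes Â is the symmetrically normalized adjacency matrix with self-loops. The function names and the toy graph are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def normalize_adjacency(adj: Matrix) -> Matrix:
    """Return D^{-1/2} (A + I) D^{-1/2}; the self-loops are an assumption."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # every degree >= 1 due to self-loops
    return [[a_hat[i][j] / ((deg[i] ** 0.5) * (deg[j] ** 0.5))
             for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product, adequate for a toy example."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def hop2token(adj: Matrix, features: Matrix, num_hops: int) -> List[Matrix]:
    """For each node, build a token sequence [hop 0, hop 1, ..., hop K].

    tokens[v][k] is the feature vector of node v aggregated over k hops,
    i.e. row v of (A_norm ** k) @ X.
    """
    a_norm = normalize_adjacency(adj)
    per_hop = [features]
    current = features
    for _ in range(num_hops):
        current = matmul(a_norm, current)  # propagate one more hop
        per_hop.append(current)
    return [[per_hop[k][v] for k in range(num_hops + 1)]
            for v in range(len(features))]


if __name__ == "__main__":
    # Toy bipartite graph: nodes 0-1 are "diseases", nodes 2-3 "metabolites".
    toy_adj = [
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
    ]
    toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
    tokens = hop2token(toy_adj, toy_features, num_hops=2)
    print(len(tokens), len(tokens[0]), len(tokens[0][0]))
```

Each node ends up with a sequence of K + 1 tokens that a Transformer encoder can attend over, so multi-hop neighborhood information is precomputed once rather than gathered by message passing during training.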

Citations: 0
ANNInter: A platform to explore ncRNA-ncRNA interactome of Arabidopsis thaliana.
Pub Date : 2024-12-27 DOI: 10.1016/j.compbiolchem.2024.108328
A T Vivek, Namrata Sahu, Garima Kalakoti, Shailesh Kumar

Eukaryotic transcriptomes are remarkably complex, encompassing not only protein-coding RNAs but also an expanding repertoire of noncoding RNAs (ncRNAs). In plants, ncRNA-ncRNA interactions (NNIs) have emerged as pivotal regulators of gene expression, orchestrating development and adaptive responses to stress. Despite their critical roles, the functional significance of NNIs remains poorly understood, largely due to a lack of comprehensive resources. Here, we present ANNInter, a comprehensive platform that integrates computational predictions with experimental datasets to systematically identify and analyze NNIs. The current version catalogs over 90,000 interactions spanning eight categories of sRNA-to-longer ncRNAs, each extensively annotated with interaction types, identification methods, and functional descriptions. The integrated schema and advanced visualization framework in ANNInter enable users to explore intricate interaction networks, providing system-wide insights into ncRNA-mediated regulation. These interaction data offer opportunities to uncover the regulatory roles of NNIs in key biological processes such as growth regulation, stress adaptation, and cellular signaling. By providing an extensive, curated repository of computational and degradome-based interaction data, ANNInter offers a platform for the study of ncRNA biology, helping to elucidate the complex mechanisms of NNIs and supporting the concept of competing endogenous RNAs (ceRNAs) in gene regulation. The platform is freely accessible at https://www.nipgr.ac.in/ANNInter/.

Citations: 0