Current Bioinformatics: Latest Articles

Mining Transcriptional Data for Precision Medicine: Bioinformatics Insights into Inflammatory Bowel Disease
IF 4 | Zone 3, Biology | Q3 BIOCHEMICAL RESEARCH METHODS | Pub Date: 2024-08-12 | DOI: 10.2174/0115748936302814240729062857
Arman Shahriari, Shokoofeh Amirzadeh Shams, Hamidreza Mahboobi, Maryam Yazdanparast, Amirreza Jabbaripour Sarmadian
Inflammatory Bowel Disease (IBD), encompassing ulcerative colitis and Crohn’s disease, affects millions worldwide. Characterized by a complex interplay of genetic, microbial, and environmental factors, IBD challenges conventional treatment approaches, necessitating precision medicine. This paper reviews the role of bioinformatics in leveraging transcriptional data for novel IBD diagnostics and therapeutics. It highlights the genomic landscape of IBD, focusing on genetic factors and insights from genome-wide association studies. The interrelation between the gut microbiome and host transcriptional responses in IBD is examined, emphasizing the use of bioinformatics tools in deciphering these interactions. Our study synthesizes developments in transcriptomics and proteomics, revealing aberrant gene and protein expression patterns linked to IBD pathogenesis. We advocate for the integration of multi-omics data, underscoring the complexity and necessity of bioinformatics in interpreting these datasets. This approach paves the way for personalized treatment strategies, improved disease prognosis, and enhanced patient care. The insights provided offer a comprehensive overview of IBD, highlighting bioinformatics as key in advancing personalized healthcare in IBD management.
Citations: 0
A Parallel Implementation for Large-Scale TSR-based 3D Structural Comparisons of Protein and Amino Acid
IF 4 | Zone 3, Biology | Q3 BIOCHEMICAL RESEARCH METHODS | Pub Date: 2024-08-02 | DOI: 10.2174/0115748936306625240724102438
Feng Chen, Tarikul I. Milon, Poorya Khajouie, Antoinette Myers, Wu Xu
Background: Proteins play a vital role in sustaining life, requiring the formation of specific 3D structures to manifest their essential biological functions. Structure comparison techniques are benefiting from the ever-expanding repositories of the Protein Data Bank. The development of computational tools for protein and amino acid 3D structural comparisons plays an important role in understanding protein functions. The Triangular Spatial Relationship (TSR)-based method was developed for this purpose. Methods: A parallelization strategy and actual implementation on high-performance clusters using the distributed and shared memory programming model, along with the utilization of multi-core CPU and many-core GPU accelerators, were developed. 3D structures of proteins and amino acids are represented by an integer vector in the TSR-based method. This parallelization strategy is designed for the TSR-based method for large-scale 3D structural comparisons of proteins and amino acids in this study. It can also be adapted to other applications where a vector type of data structure is used. Results: Due to the nature of the vector representation of protein and amino acid structures using the TSR-based method, the comparison algorithm is well-suited for parallelization on large-scale supercomputers. Performance studies on the representative datasets were conducted to demonstrate the efficiency of the parallelization strategy. It allows comparisons of large 3D protein or amino acid structure datasets to finish within a reasonable amount of time. Conclusion: The case studies, by taking advantage of this parallelization code, demonstrate that applying either mirror image or feature selection in the TSR-based algorithms improves the classifications of protein and amino acid 3D structures. The TSR keys have the advantage of performing structure-based BLAST searches. The parallelization code could be used as a reference for similar future studies.
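To illustrate why an integer-vector representation lends itself to parallel comparison, the following minimal sketch compares every pair of structures independently. It assumes each structure is a bag of integer keys and uses a generalized Jaccard similarity; the similarity measure, the toy keys, and the function names are assumptions made for illustration and are not taken from the paper. The thread pool only shows how the work can be partitioned; a CPU-bound speed-up would need processes, MPI ranks, or GPU kernels, as the abstract describes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def jaccard(keys_a, keys_b):
    """Generalized Jaccard similarity between two multisets of integer keys."""
    a, b = Counter(keys_a), Counter(keys_b)
    shared = sum((a & b).values())   # multiset intersection size
    total = sum((a | b).values())    # multiset union size
    return shared / total if total else 1.0


def pairwise_similarity(structures, workers=4):
    """Compare every pair of structures; each pair is an independent task."""
    names = sorted(structures)
    pairs = list(combinations(names, 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(
            lambda p: jaccard(structures[p[0]], structures[p[1]]), pairs
        )
        return dict(zip(pairs, scores))


# Hypothetical integer keys for three toy structures (not real TSR keys).
toy = {
    "A": [101, 101, 205, 307],
    "B": [101, 205, 205, 307],
    "C": [900, 901],
}
result = pairwise_similarity(toy)
```

Because no pair depends on any other, the list of pairs can be split across workers in any order, which is the property that makes this kind of comparison scale.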
Citations: 0
Prediction of miRNA-disease Associations by Deep Matrix Decomposition Method based on Fused Similarity Information
IF 4 | Zone 3, Biology | Q3 BIOCHEMICAL RESEARCH METHODS | Pub Date: 2024-08-02 | DOI: 10.2174/0115748936300759240712061707
Xia Chen, Qiang Qu, Xiang Zhang, Hao Nie, Xiuxiu Chao, Weihao Ou, Haowen Chen, Xiangzheng Fu
Aim: MicroRNAs (miRNAs), pivotal regulators in various biological processes, are closely linked to human diseases. This study aims to propose a computational model, SIDMF, for predicting miRNA-disease associations. Background: Computational methods have proven efficient in predicting miRNA-disease associations, leveraging functional similarity and network-based inference. Machine learning techniques, including support vector machines, semi-supervised algorithms, and deep learning models, have gained prominence in this domain. Objective: Develop a computational model that integrates disease semantic similarity and miRNA functional similarity within a deep matrix factorization framework to predict potential associations between miRNAs and diseases accurately. Methods: SIDMF, introduced in this study, integrates disease semantic similarity and miRNA functional similarity within a deep matrix factorization framework. Through the reconstruction of the miRNA-disease association matrix, SIDMF predicts potential associations between miRNAs and diseases. Results: The performance of SIDMF was evaluated using global Leave-One-Out Cross-Validation (LOOCV) and local LOOCV, achieving high Area Under the Curve (AUC) values of 0.9536 and 0.9404, respectively. Comparative analysis against other methods demonstrated the superior performance of SIDMF. Case studies on breast cancer, esophageal cancer, and prostate cancer further validated SIDMF's predictive accuracy, with a substantial percentage of the top 50 predicted miRNAs confirmed in relevant databases. Conclusion: SIDMF emerges as a promising computational model for predicting potential associations between miRNAs and diseases. Its robust performance in global and local evaluations, along with successful case studies, underscores its potential contributions to disease prevention, diagnosis, and treatment.
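The AUC values reported above come from leave-one-out evaluation. As commonly described, each known association is hidden in turn, the model is refit, and the hidden pair's score is ranked against candidate pairs (all unobserved pairs in the global setting, only those of the same disease in the local setting). The sketch below shows only the final step, computing AUC from two lists of scores with the rank-based formulation; the scores are hypothetical, and the abstract does not describe the authors' evaluation code.

```python
def auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: held-out known associations vs. unobserved pairs.
held_out_known = [0.91, 0.78, 0.40]
unobserved = [0.85, 0.30, 0.22, 0.10]
value = auc(held_out_known, unobserved)
```

The quadratic loop keeps the definition visible; for large candidate sets a sort-based computation would be the usual choice.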
Citations: 0
Identifying Key Clinical Indicators Associated with the Risk of Death in Hospitalized COVID-19 Patients
IF 4 | Zone 3, Biology | Q3 BIOCHEMICAL RESEARCH METHODS | Pub Date: 2024-08-02 | DOI: 10.2174/0115748936306893240720192301
QingLan Ma, Jingxin Ren, Lei Chen, Wei Guo, KaiYan Feng, Tao Huang, Yu-Dong Cai
Background: Accurately predicting survival in hospitalized COVID-19 patients is crucial but challenging due to multiple risk factors. This study addresses the limitations of existing research by proposing a comprehensive machine-learning framework to identify key mortality risk factors and develop a robust predictive model. Objective: This study proposes an analytical framework that leverages various machine learning techniques to predict the survival of hospitalized COVID-19 patients accurately. The framework comprehensively evaluates multiple clinical indicators and their associations with mortality risk. Method: Patient data, including gender, age, health condition, and smoking habits, were divided into discharged (n=507) and deceased (n=300) categories. Each patient was characterized by 92 clinical features. The framework incorporated seven feature ranking algorithms (LASSO, LightGBM, MCFS, mRMR, RF, CATBoost, and XGBoost), the IFS method, and four classification algorithms (DT, KNN, RF, and SVM). Results: Age, diabetes, dyspnea, chronic kidney failure, and high blood pressure were identified as the most important risk factors. The best model achieved an F1-score of 0.857 using KNN with 34 selected features. Conclusion: This study provides a comprehensive analysis of COVID-19 mortality risk factors and develops a robust predictive model. The findings highlight the increased risk in patients with comorbidities, consistent with existing literature. The proposed framework can aid in developing personalized treatment plans and allocating healthcare resources effectively.
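The IFS step mentioned in the Method section can be sketched as follows: given a feature ranking, evaluate a classifier on the top-1, top-2, ... features and keep the subset with the best F1-score. This is a toy illustration under stated assumptions, not the authors' pipeline: the ranking is supplied by hand, a leave-one-out 1-nearest-neighbour classifier stands in for the KNN model, and all data and function names are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class, computed from true/false positives and negatives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def loo_1nn_predict(rows, labels, feature_idx):
    """Leave-one-out 1-nearest-neighbour predictions using only the chosen features."""
    preds = []
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # never use a sample to predict itself
            dist = sum((row[k] - other[k]) ** 2 for k in feature_idx)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        preds.append(best_label)
    return preds


def incremental_feature_selection(rows, labels, ranked_features):
    """Try the top-k ranked features for k = 1..N and keep the best-scoring subset."""
    best_k, best_f1 = 0, -1.0
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        score = f1_score(labels, loo_1nn_predict(rows, labels, subset))
        if score > best_f1:  # strict '>' prefers the smaller subset on ties
            best_k, best_f1 = k, score
    return ranked_features[:best_k], best_f1


# Hypothetical toy data: feature 0 separates the classes, features 1 and 2 are noise.
rows = [[0.1, 5.0, 0.0], [0.2, 1.0, 9.0], [0.9, 5.1, 9.1], [1.0, 1.1, 0.1]]
labels = [0, 0, 1, 1]
selected, best = incremental_feature_selection(rows, labels, [0, 2, 1])
```

Preferring the smaller subset on ties is a design choice made here for simplicity; the abstract does not say how ties were resolved.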
Citations: 0
Recent Progress of Deep Learning Methods for RBP Binding Sites Prediction on circRNA
IF 4 | Zone 3, Biology | Q3 BIOCHEMICAL RESEARCH METHODS | Pub Date: 2024-08-02 | DOI: 10.2174/0115748936308564240712053215
Zhengfeng Wang, Xiujuan Lei, Yuchen Zhang, Fang-Xiang Wu, Yi Pan
The interaction between circular RNA (circRNA) and RNA binding protein (RBP) plays an important biological role in the occurrence and development of various diseases. High-throughput biological experimental methods such as CLIP-seq can effectively analyze the interaction between the two, but biological experiments are inefficient and expensive, and they can only capture binding sites of a specific RBP on circRNA in a selected cell environment at a time. These biological experiments still rely on downstream data analysis to understand the mechanisms behind many biological structures and physiological processes. However, the rapid growth of experimental data dimensions and production speed poses challenges to traditional analysis methods. In recent years, deep learning has made great progress in genome and transcriptome analysis, and some deep learning prediction algorithms for RBP binding sites on circRNA have also emerged. In this paper, we briefly introduce biological background knowledge related to circRNA-RBP interaction; present relevant deep learning techniques in this field, including the problem formulation, data source, sequence encoding, deep learning model, and overall process of RBP binding site prediction on circRNA; and analyze the current deep learning methods in depth. Finally, some problems in current research and directions for future research are discussed. We hope this review helps researchers without a background in deep learning or biology quickly understand RBP binding site prediction on circRNA.
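As a concrete instance of the sequence encoding step listed above, the sketch below turns an RNA sequence into one-hot rows and cuts it into fixed-length fragments. One-hot encoding is a common choice for sequence models, but the abstract does not say which encodings the review covers, so treat the scheme, the window sizes, and the function names as assumptions.

```python
ALPHABET = "ACGU"


def one_hot(sequence):
    """Encode an RNA sequence as a list of 4-dimensional one-hot rows (A, C, G, U)."""
    rows = []
    for base in sequence.upper().replace("T", "U"):
        row = [0, 0, 0, 0]
        if base in ALPHABET:  # unknown bases such as N stay all-zero
            row[ALPHABET.index(base)] = 1
        rows.append(row)
    return rows


def windows(sequence, size, step=1):
    """Slide a fixed-length window so every fragment has the same input shape."""
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, step)]


encoded = one_hot("ACGUN")
fragments = windows("ACGUACG", size=4, step=3)
```

Fixed-length fragments matter because most deep learning models expect inputs of a uniform shape; each fragment would then be labeled as containing a binding site or not.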
Citations: 0
TCM@MPXV: A Resource for Treating Monkeypox Patients in Traditional Chinese Medicine
IF 4 | Zone 3, Biology | Q3 BIOCHEMICAL RESEARCH METHODS | Pub Date: 2024-08-02 | DOI: 10.2174/0115748936299878240723044438
Xin Zhang, Feiran Zhou, Pinglu Zhang, Quan Zou, Ying Zhang
Introduction: Traditional Chinese Medicine (TCM) has been extensively employed in the treatment of Monkeypox Virus (MPXV) infections, and it has historically played a significant role in combating contagious pox-like viral diseases in China. Method: Various TCM therapies have been recommended for patients with MPXV. However, as far as we know, there is no comprehensive database dedicated to preserving and coordinating TCM remedies for combating MPXV. To address this gap, we introduce TCM@MPXV, a carefully curated repository of research materials focusing on formulations with anti-MPXV properties. Importantly, TCM@MPXV extends its scope beyond herbal remedies, encompassing mineral-based medicines as well. Result: The current iteration of TCM@MPXV offers the following features: (1) Documenting over 42 types of TCM herbs, with more than 27 unique herbs; (2) Recording over 285 bioactive compounds within these herbs; (3) Launching a user-friendly web server for the docking, analysis, and visualization of 2D or 3D molecular structures; and (4) Providing 3D structures of druggable proteins of MPXV. Conclusion: To summarize, TCM@MPXV presents a user-friendly and effective platform for recording, querying, and viewing anti-MPXV TCM resources and will contribute to the development and explanation of novel anti-MPXV mechanisms of action to aid in the ongoing battle against monkeypox. TCM@MPXV is accessible for academic use at http://101.34.238.132:5000/.
Citations: 0
Comparison between Ribosomal Assembly and Machine Learning Tools for Microbial Identification of Organisms with Different Characteristics
IF 4 | Zone 3, Biology | Q3 BIOCHEMICAL RESEARCH METHODS | Pub Date: 2024-07-29 | DOI: 10.2174/0115748936299440240709070105
Stephanie Chau, Carlos Rojas, Jorjeta G. Jetcheva, Mary Markart, Sudha Vijayakumar, Sophia Yuan, Vincent Stowbunenko, Amanda N. Shelton, William B. Andreopoulos
Background: Genome assembly tools are used to reconstruct genomic sequences from raw sequencing data, which are then used for identifying the organisms present in a metagenomic sample. Methodology: More recently, machine learning approaches have been applied to a variety of bioinformatics problems, and in this paper, we explore their use for organism identification. We start by evaluating several commonly used metagenomic assembly tools, including PhyloFlash, MEGAHIT, MetaSPAdes, Kraken2, Mothur, UniCycler, and PathRacer, and compare them against state-of-the-art deep learning-based machine learning classification approaches represented by DNABERT and DeLUCS, in the context of two synthetic mock community datasets. Result: Our analysis focuses on determining whether ensembling metagenome assembly tools with machine learning tools has the potential to improve identification performance relative to using the tools individually. Conclusion: We find that this is indeed the case, and analyze the level of effectiveness of potential tool ensembling for organisms with different characteristics (based on factors such as repetitiveness, genome size, and GC content).
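One simple way to picture tool ensembling is a plurality vote over the labels that individual tools assign to the same sequence. The abstract does not describe how the authors combine tool outputs, so the voting rule below is an assumption, and the tool names in the usage example are placeholders rather than outputs of the tools listed above. A GC content helper is included because the abstract names GC content as one of the organism characteristics considered.

```python
from collections import Counter


def ensemble_vote(predictions):
    """Plurality vote over per-tool labels; returns None on a tie or when no tool answered."""
    counts = Counter(label for label in predictions.values() if label is not None)
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the top two labels: leave unassigned
    return ranked[0][0]


def gc_content(sequence):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Hypothetical per-tool calls for one contig; tool names are placeholders.
calls = {"assembler_a": "E. coli", "assembler_b": "E. coli", "classifier_c": "B. subtilis"}
consensus = ensemble_vote(calls)
```

Returning None on a tie keeps ambiguous calls visible instead of silently favoring one tool, which makes it easier to study where ensembling helps.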
Citations: 0
A Novel Method for Mining Regulatory sRNAs Related to Rice Resistance Against Blast Fungus from Multi-Omics Data
IF 4 Zone 3 (Biology) Q3 BIOCHEMICAL RESEARCH METHODS Pub Date: 2024-07-23 DOI: 10.2174/0115748936305102240705052723
Jianhua Sheng, Enshuang Zhao, Yuheng Zhu, Yinfei Dai, Borui Zhang, Qingming Qin, Hao Zhang
Background: Due to infection by the rice blast fungus, rice, a major global staple, faces yield challenges. While chemical control methods are common, their environmental and economic costs are growing concerns. Traditional biological experiments are also inefficient for exploring resistance genes. Therefore, understanding the interaction between rice and the rice blast fungus is urgent and important. Objective: This study aims to use multi-omics data to uncover key elements in rice's defense against the rice blast fungus Magnaporthe oryzae. We built a detailed, multi-layered heterogeneous interaction network, employing an innovative graph embedding feature with a cross-layer random walk algorithm, to identify crucial resistance factors. This could inform strategies for enhancing disease resistance in rice. Methods: We integrated genomics, transcriptomics, and proteomics data on Magnaporthe oryzae infecting rice. These multi-omics data were used to construct a multi-layer heterogeneous network. An advanced graph embedding algorithm (BINE) provided rich vector representations of network nodes. A multi-layer network walking algorithm was then used to analyze the network and identify key regulatory small RNAs (sRNAs) in rice. Results: Node similarity rankings allowed us to identify significant regulatory sRNAs in rice that are integral to disease resistance. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses further revealed their roles in biological processes and key metabolic pathways. Our integrative method precisely and efficiently identified these crucial elements, offering a valuable systems biology tool. Conclusion: By integrating multi-omics data with computational analysis, this study reveals key regulatory sRNAs in rice's disease resistance mechanism. These findings enhance our understanding of rice disease resistance and provide genetic resources for breeding disease-resistant rice. Despite limitations in sRNA functional interpretation, this research demonstrates the power of applying multi-omics data to address complex biological problems.
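The abstract names a cross-layer random walk but gives no details. To give a rough sense of how a walk on a multi-layer network can rank nodes, here is a minimal random walk with restart on a toy network; this is a generic technique swapped in for illustration, not the authors' algorithm, and the node names, edges, and `restart` value are invented.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, iters=100):
    """Score nodes by a random walk that restarts at `seeds`.

    adj maps each node to its neighbour list (undirected edges are listed
    in both directions). Every node is assumed to have at least one
    neighbour. Cross-layer edges are treated like any other edge here.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(iters):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            share = (1.0 - restart) * p[n] / len(adj[n])
            for m in adj[n]:
                nxt[m] += share  # spread the walker's mass to neighbours
        p = nxt
    return p


# Toy three-layer network (hypothetical): sRNA, gene, and protein nodes.
network = {
    "sRNA1": ["geneA"],
    "sRNA2": ["geneB"],
    "geneA": ["sRNA1", "protX"],
    "geneB": ["sRNA2", "protX"],
    "protX": ["geneA", "geneB"],
}
scores = random_walk_with_restart(network, seeds={"geneA"})
ranked_srnas = sorted(["sRNA1", "sRNA2"], key=scores.get, reverse=True)
print(ranked_srnas)
```

In this toy case the sRNA directly linked to the seed gene ranks above the one that is three hops away, which is the intuition behind ranking candidate regulators by walk-based proximity.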
Citations: 0
DNA Binding Protein Prediction based on Multi-feature Deep Metatransfer Learning
IF 4 Zone 3 (Biology) Q3 BIOCHEMICAL RESEARCH METHODS Pub Date: 2024-07-22 DOI: 10.2174/0115748936290782240624114950
Chunliang Wang, Fanfan Kong, Yu Wang, Hongjie Wu, Jun Yan
Background: In recent years, the rapid development of deep learning technology has had a significant impact on the prediction of DNA-binding proteins (DBPs). Deep neural networks can automatically learn complex features in protein and DNA sequences, improving prediction accuracy and generalization. Objective: This article establishes a meta-transfer model and combines it with a deep learning model to predict DNA-binding proteins. Methods: This study introduces a meta-learning algorithm based on transfer learning, which helps achieve rapid learning and adaptation to new tasks. In addition, normalized Moreau-Broto autocorrelation (NMBAC), position-specific scoring matrix-discrete cosine transform (PSSM-DCT), and position-specific scoring matrix-discrete wavelet transform (PSSM-DWT) are used for feature extraction. Finally, DBP prediction is achieved through a deep neural network model based on the attention mechanism. Results: This paper first establishes the basis of deep meta-transfer learning, then uses the PDB186 data set as the benchmark to extract features with NMBAC, PSSM-DCT, and PSSM-DWT, respectively, compares the fused features in pairs, and finally obtains the fused feature set. After deep learning processing, the fused features are found to give the best prediction performance. Compared with currently popular models, there are clear improvements in the ACC, MCC, SN, and Spec evaluation metrics. Conclusion: The method used in this article can effectively predict DNA-binding proteins and shows improved performance.
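To make the PSSM-DCT step more concrete, the sketch below applies an unnormalized DCT-II to each column of a position-specific scoring matrix and keeps the first few coefficients as a fixed-length feature vector. The abstract does not state the DCT variant or the number of coefficients retained, so both are assumptions here, as are the function names and the toy matrix (a real PSSM has 20 columns).

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]


def pssm_dct_features(pssm, keep=2):
    """Concatenate the first `keep` DCT coefficients of each PSSM column.

    pssm is a list of rows, one per residue. The `keep` value is an
    assumed hyperparameter, not one reported in the abstract.
    """
    feats = []
    for column in zip(*pssm):
        feats.extend(dct2(list(column))[:keep])
    return feats


# Toy 4-residue PSSM with 3 columns (hypothetical scores).
toy_pssm = [
    [1.0, 0.0, 2.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 2.0],
    [1.0, 1.0, 0.0],
]
features = pssm_dct_features(toy_pssm, keep=2)
print(len(features))
```

Because only low-order coefficients are kept, proteins of different lengths map to vectors of the same size, which is what lets such features be fused with others before classification.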
Citations: 0
CNRBind: Small Molecule-RNA Binding Sites Recognition via Site Significant from Nucleotide and Complex Network Information
IF 4 Zone 3 (Biology) Q3 BIOCHEMICAL RESEARCH METHODS Pub Date: 2024-07-22 DOI: 10.2174/0115748936296412240625111040
Lichao Zhang, Kang Xiao, Xueting Wang, Liang Kong
Background: Small molecule-RNA binding sites play a significant role in developing drugs for disease treatment. However, developing accurate computational tools for identifying these binding sites remains a challenge. Method: In this study, an accurate prediction model named CNRBind was constructed by extracting site-significant information from nucleotides and complex networks. We designed complex networks and calculated three topological structural parameters according to RNA tertiary structure. Acknowledging nucleotide interdependence, a sliding window was selected to integrate the influence of adjacent sites. Finally, the model was constructed using a random forest classifier. Results: Compared to other computational tools, CNRBind was competitive and had excellent discriminative ability for metal ion-binding site prediction. Furthermore, statistical analysis revealed significant differences between CNRBind and existing methods. Additionally, CNRBind is a promising predictor in cases where experimental tertiary structure is unavailable. Conclusion: These results show that CNRBind is effective because of the proposed site-significant information encoding strategy. The approach provides a reasonable supplement for biological research. The dataset and source code can be accessed at: https://github.com/Kangxiaoneuq/CNRBind.
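The sliding-window idea mentioned in the Method can be sketched as below: each nucleotide's feature vector is concatenated with those of its neighbours, so a downstream classifier sees local context. The window half-width, the zero padding at the sequence ends, and the function name are assumptions for illustration; the abstract does not specify them, and this is not the CNRBind code.

```python
def window_features(per_site, half_width=1, pad=0.0):
    """Concatenate each site's features with those of its neighbours.

    per_site is a list of equal-length feature vectors, one per nucleotide.
    Positions outside the sequence are filled with `pad` (an assumption).
    """
    dim = len(per_site[0])
    rows = []
    for i in range(len(per_site)):
        row = []
        for j in range(i - half_width, i + half_width + 1):
            if 0 <= j < len(per_site):
                row.extend(per_site[j])
            else:
                row.extend([pad] * dim)
        rows.append(row)
    return rows


# Hypothetical one-dimensional features for a 3-nucleotide RNA.
site_values = [[1.0], [2.0], [3.0]]
windowed = window_features(site_values, half_width=1)
print(windowed)
```

Each output row has `(2 * half_width + 1) * dim` entries, and rows of this kind could then be passed to a classifier such as a random forest.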
Citations: 0