
Current Bioinformatics: Latest Articles

Research on the Mechanism of Traditional Chinese Medicine Treatment for Diseases caused by Human Coronavirus COVID-19
IF 4 Zone 3 (Biology) Q1 Mathematics Pub Date: 2024-04-02 DOI: 10.2174/0115748936292599240308102616
Xian-Fang Wang, Chong-Yang Ma, Zhi-Yong Du, Yi-Feng Liu, Shao-Hui Ma, Sang Yu, Rui-xia Jin, Dong-qing Wei
Background: Human coronaviruses are a large group of viruses that exist widely in nature and multiply through self-replication. Because of their sudden emergence and variability, they pose a great threat to global human health and are a major problem currently faced by the medical and health fields. Objective: COVID-19 is caused by the seventh known coronavirus that can infect humans. The main purpose of this paper is to analyze the effective components and action targets of the Longyi-Zhengqi Formula and the Lianhua-Qingwen Formula, study their mechanisms of action in the treatment of novel coronavirus pneumonia (COVID-19), compare the similarities and differences of their pharmacological effects, and obtain the pharmacodynamic mechanism of the two traditional Chinese medicine compounds. Method: The effective ingredients and targets of the Longyi-Zhengqi Formula and the Lianhua-Qingwen Formula were obtained from ETCM (Encyclopedia of Traditional Chinese Medicine) and other traditional Chinese medicine databases, the GeneCards database was used to obtain the targets related to COVID-19, and Cytoscape software was used to build a component-COVID-19 target network for each formula. STRING was used to construct a protein-protein interaction (PPI) network and screen key targets. GO (Gene Ontology) was used for enrichment analysis and KEGG (Kyoto Encyclopedia of Genes and Genomes) was used for pathway analysis to find the targets and pathways related to the treatment of COVID-19.
Results: In the GO enrichment analysis, the intersection PPI network targets of Longyi-Zhengqi Formula and COVID-19 were associated with 106 biological processes, 31 cellular components and 28 molecular functions, and those of Lianhua-Qingwen Formula and COVID-19 with 224 biological processes, 51 cellular components and 55 molecular functions. In the KEGG pathway analysis, Longyi-Zhengqi Formula had 7 targets on the COVID-19 pathway and Lianhua-Qingwen Formula had 19. In the regulation analysis, Longyi-Zhengqi Formula achieves the effect of treating COVID-19 by regulating IL-6, and Lianhua-Qingwen Formula achieves the effect of treating pneumonia by regulating TLR4. Conclusion: This paper explores the mechanism of action of Longyi-Zhengqi Formula and Lianhua-Qingwen Formula in treating COVID-19 based on network pharmacology, and provides a theoretical basis, in terms of drug targets and disease interactions, for using traditional Chinese medicine to treat sudden diseases caused by human coronaviruses.
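The target-intersection and enrichment steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the gene sets are made-up placeholders (only IL6 and TLR4 are named in the abstract), the function names are hypothetical, and the one-sided hypergeometric test is assumed here as the enrichment statistic because it is the usual choice for GO/KEGG over-representation analysis.

```python
from math import comb


def intersect_targets(formula_targets, disease_targets):
    """Return the targets shared by a formula and a disease."""
    return set(formula_targets) & set(disease_targets)


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric p-value P(X >= k).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical example data (placeholders, not taken from the paper).
formula = {"IL6", "TNF", "TLR4", "EGFR"}
disease = {"IL6", "TLR4", "ACE2"}
shared = intersect_targets(formula, disease)

# Assumed toy numbers: 2 of 4 query genes hit a term with 50 members
# in a background of 20000 genes.
p_value = hypergeom_enrichment_p(k=2, n=4, K=50, N=20000)
print(sorted(shared), p_value)
```

In practice the shared targets would be passed to a PPI resource such as STRING before enrichment, and p-values would be corrected for multiple testing across terms.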
Citations: 0
A Novel Machine-learning Model to Classify Schizophrenia Using Methylation Data Based on Gene Expression
IF 4 Zone 3 (Biology) Q1 Mathematics Pub Date: 2024-03-11 DOI: 10.2174/0115748936293407240222113019
Karthikeyan A. Vijayakumar, Gwang-Won Cho
Introduction: Recent advances in artificial intelligence have compelled medical research to adopt these technologies. The abundance of molecular data, combined with AI technology, has helped in explaining various diseases, including cancers. Schizophrenia is a complex neuropsychological disease whose etiology is unknown. Several genome-wide association studies attempted to narrow down the cause of the disease but did not successfully identify the mechanism behind it. There are studies on the epigenetic changes in schizophrenia, and a classification machine-learning model has been trained using blood methylation data. Method: In this study, we demonstrate a novel approach to elucidating the molecular cause of the disease. We used a two-step machine-learning approach to determine the causal molecular markers, developing classification models from both gene expression microarray and methylation microarray data. Result: Owing to this approach, our models achieved good classification accuracy with the available data size. We analyzed the important features, which add up to evidence for the glutamate hypothesis of schizophrenia. Conclusion: In this way, we have demonstrated how a disease can be explained through machine-learning models.
Citations: 0
An Extended Feature Representation Technique for Predicting Sequenced-based Host-pathogen Protein-protein Interaction
IF 4 Zone 3 (Biology) Q1 Mathematics Pub Date: 2024-03-11 DOI: 10.2174/0115748936286848240108074303
Jerry Emmanuel, Itunuoluwa Isewon, Grace Olasehinde, Jelili Oyelade
Background: The use of machine learning models in sequence-based protein-protein interaction prediction typically requires the conversion of amino acid sequences into feature vectors. In the literature, two approaches have been used to achieve this transformation, referred to as the Independent Protein Feature (IPF) and Merged Protein Feature (MPF) extraction methods. Studies have predominantly adopted the IPF approach, while others preferred the MPF method, in which host and pathogen sequences are concatenated before feature encoding. Objective: This presents the challenge of determining which approach should be adopted for improved host-pathogen protein-protein interaction (HPPPI) prediction. Therefore, this work introduces the Extended Protein Feature (EPF) method. Methods: The proposed method combines the predictive capabilities of IPF and MPF, extracting essential features, handling multicollinearity, and removing features with zero importance. EPF, IPF, and MPF were tested using bacteria, parasite, virus, and plant HPPPI datasets and were deployed to machine learning models, including Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP), Naïve Bayes (NB), Logistic Regression (LR), and Deep Forest (DF). Results: MPF exhibited the lowest performance overall, whereas IPF performed better with decision tree-based models, such as RF and DF. In contrast, EPF demonstrated improved performance with SVM, LR, NB, and MLP and also yielded competitive results with DF and RF. Conclusion: The EPF approach developed in this study exhibits substantial improvements in four out of the six models evaluated. This suggests that EPF is competitive with IPF and is particularly well-suited for traditional machine learning models.
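The difference between the IPF and MPF strategies can be made concrete with a small sketch. The abstract does not say which sequence encoding was used, so amino acid composition (AAC) is assumed here purely for illustration, and the function names `aac`, `ipf`, and `mpf` are hypothetical. The EPF feature selection steps are not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Amino acid composition: fraction of each standard residue in seq."""
    seq = seq.upper()
    n = len(seq)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def ipf(host_seq, pathogen_seq):
    """Independent features: encode each protein separately, then join
    the two vectors (assumed reading of IPF; 40 values with AAC)."""
    return aac(host_seq) + aac(pathogen_seq)


def mpf(host_seq, pathogen_seq):
    """Merged features: concatenate the sequences first, then encode
    the merged sequence once (20 values with AAC)."""
    return aac(host_seq + pathogen_seq)


# Toy sequences, not real proteins.
host, pathogen = "ACDA", "AAKK"
print(len(ipf(host, pathogen)), len(mpf(host, pathogen)))
```

With this encoding, IPF keeps host and pathogen statistics in separate vector positions, while MPF blends them into one composition, which is one plausible reason the two strategies behave differently across classifiers.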
Citations: 0
Relational Graph Convolution Network with Multi Features for AntiCOVID-19 Drugs Discovery using 3CLpro Potential Target
IF 4 Zone 3 (Biology) Q1 Mathematics Pub Date: 2024-03-11 DOI: 10.2174/0115748936280392240219054047
Medard Edmund Mswahili, Goodwill Erasmo Ndomba, Young Jin Kim, Kyuri Jo, Young-Seob Jeong
Background: The potential of graph neural networks (GNNs) to revolutionize the analysis of non-Euclidean data has gained attention recently, making them attractive models for deep machine learning. However, insufficient compound or molecular graphs and feature representations might significantly impair their full potential. Despite the devastating global impact of the ongoing COVID-19 pandemic, no drug with proven efficacy is available. As various stages of drug discovery and repositioning require the accurate prediction of drug-target interactions (DTI), we propose a relational graph convolution network (RGCN) using multi-features, based on the developed drug chemical compound-coronavirus target graph representation and a combination of features. During the implementation of the model, we further introduced the use of not only the feature module to understand the topological structure of drugs but also the structure of the proven drug target (i.e., 3CLpro) for SARS-CoV-2, which shares a genome sequence similar to that of other members of the beta-coronavirus group such as SARS-CoV, MERS-CoV, and bat coronavirus. Our feature comprises topological information in molecular SMILES and local chemical context in the SMILES sequence for the drug chemical compound and drug target. Our proposed method achieved a high accuracy of 97.30%, which could be prioritized as a promising prediction route for the development of novel oral antiviral medicine for COVID-19. Objective: Forecasting DTI is a pivotal aspect of drug discovery. The focus on computational methods in DTI prediction has intensified due to the considerable expense and time investment associated with conducting extensive in vitro and in vivo experiments. Machine learning techniques, particularly deep learning, have found broad applications in DTI prediction.
We are convinced that this study could be prioritized and utilized as a promising predictive route for the development of novel oral antiviral treatments for COVID-19 and other variants of coronaviruses. Methods: This study addressed the problem of COVID-19 drugs using the proposed RGCN with multi-features as an attractive and potential route. It focused mainly on the prediction of novel antiviral drugs against coronaviruses using a graph-based methodology, namely RGCN. It further utilized the features of both drugs and common potential drug targets found in the betacoronavirus group to deepen understanding of their underlying relation. Results: Our suggested approach achieved a high accuracy of 97.30%, which may be utilized as a top priority to support and advance this field in the prediction and development of novel antiviral treatments against coronaviruses and their variants. Conclusion: We recursively performed experiments using the proposed method on our DCCCvT graph dataset, constructed from our collected dataset, and found that our model achieved the comparably best average accuracy with the T7 features, followed by the combination of T7, R6, and L8. The results of the model proposed in this study outperform those of previous related studies.
Citations: 0
MCHAN: Prediction of Human Microbe-drug Associations Based on Multiview Contrastive Hypergraph Attention Network
IF 4 Zone 3 (Biology) Q1 Mathematics Pub Date: 2024-03-01 DOI: 10.2174/0115748936288616240212073805
Guanghui Li, Ziyan Cao, Cheng Liang, Qiu Xiao, Jiawei Luo
Background: Complex and diverse microbial communities play a pivotal role in human health and have become a new drug target. Exploring the connections between drugs and microbes not only provides profound insights into their mechanisms but also drives progress in drug discovery and repurposing. The use of wet lab experiments to identify associations is time-consuming and laborious. Hence, precise and efficient computational methods can substantially improve the efficiency of identifying associations between microorganisms and drugs. Objective: In this study, we propose a new deep learning model, the multiview contrastive hypergraph attention network (MCHAN), for human microbe–drug association prediction. Methods: First, we fuse multiple similarity matrices to obtain a fused microbial and drug similarity network. By combining graph convolutional networks (GCNs) with attention mechanisms, we extract key information from multiple perspectives. Then, we construct two network topologies based on the fused data. One topology incorporates the concept of hypernodes to capture implicit relationships between microbes and drugs, using virtual nodes to construct a hyperheterogeneous graph. Next, we propose a cross-contrastive learning task that facilitates the simultaneous guidance of graph embeddings from both perspectives, without the need for any labels. This approach allows us to bring nodes with similar features and network topologies closer while pushing away other nodes. Finally, we employ attention mechanisms to merge the outputs of the GCN and predict the associations between drugs and microbes. Results: To confirm the effectiveness of this method, we conduct experiments on three distinct datasets. The results demonstrate that the MCHAN model surpasses other methods in terms of performance. Furthermore, case studies provide additional evidence confirming the consistent predictive accuracy of the MCHAN model.
Conclusion: MCHAN is expected to become a valuable tool for predicting potential associations between microbiota and drugs in the future.
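The label-free cross-contrastive idea in the Methods (pull the two views of the same node together, push other nodes away) can be illustrated with a generic InfoNCE-style loss. The abstract does not give the loss actually used in MCHAN, so the formulation below, the cosine similarity, the temperature `tau`, and the function names are all assumptions made for this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cross_view_infonce(view_a, view_b, tau=0.5):
    """Average InfoNCE loss from view A to view B.

    view_a[i] and view_b[i] are embeddings of the same node under two
    graph topologies; every other node in view B acts as a negative.
    """
    losses = []
    for i, za in enumerate(view_a):
        logits = [cosine(za, zb) / tau for zb in view_b]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: views that agree give a lower loss than swapped views.
aligned = cross_view_infonce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = cross_view_infonce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

A symmetric variant would average this loss with the B-to-A direction; only one direction is shown to keep the sketch short.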
Citations: 0
Network Subgraph-based Method: Alignment-free Technique for Molecular Network Analysis
IF 4, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2024-02-22, DOI: 10.2174/0115748936285057240126062220
Efendi Zaenudin, Ezra B. Wijaya, Venugopala Reddy Mekala, Ka-Lok Ng
Objective: We propose a novel method for comparing directed networks by decomposing each network into small modules, the so-called network subgraph approach. It is distinct from the network motif approach because it does not depend on null model assumptions. Method: We developed an alignment-free algorithm, the Subgraph Identification Algorithm (SIA), which generates all subgraphs that have five connected nodes (5-node subgraphs). There are 9,364 such modules. We then applied SIA to 17 cancer networks and measured the similarity between each pair of networks using the Jensen-Shannon entropy (HJS). Results: We identified and examined the biological meaning of 5-node regulatory modules and of the pairs of cancer networks with the smallest HJS values. The two pairs of networks that show similar patterns are (i) endometrial cancer and hepatocellular carcinoma and (ii) breast cancer and pathways in cancer. Some published studies provide experimental data supporting the 5-node regulatory modules. Conclusion: Our method is an alignment-free approach that measures the topological similarity of 5-node regulatory modules and aligns two directed networks based on their topology. These modules capture complex interactions among multiple genes that cannot be detected by existing methods that consider only single-gene relations. We analyzed the biological relevance of the regulatory modules and used the subgraph method to identify the modules that share the same topology across two of the 17 cancer networks. We validated our findings using evidence from the literature.
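As a rough illustration of the similarity measure named in the abstract above, the sketch below compares two networks by their subgraph frequency profiles. The abstract does not give the formula for HJS, so this assumes the standard Jensen-Shannon divergence (base-2 logarithm) between normalized subgraph counts. The function names and the example count vectors are hypothetical and are not taken from the paper.

```python
import math


def normalize(counts):
    """Turn raw subgraph counts into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one subgraph count must be non-zero")
    return [c / total for c in counts]


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon(counts_a, counts_b):
    """Jensen-Shannon divergence between two subgraph count profiles.

    Assumption: both vectors list counts for the same subgraph types in the
    same order (e.g. one entry per 5-node subgraph type).
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("count vectors must have the same length")
    p = normalize(counts_a)
    q = normalize(counts_b)
    m = [(x + y) / 2 for x, y in zip(p, q)]  # mixture distribution
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# Hypothetical counts of three subgraph types in two networks.
network_a = [12, 5, 3]
network_b = [10, 6, 4]
print(jensen_shannon(network_a, network_b))
```

With a base-2 logarithm the value lies between 0 (identical profiles) and 1 (profiles with no subgraph type in common), so smaller values indicate more similar networks, which matches the abstract's focus on the pairs with the smallest HJS.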
Citations: 0
A-RFP: An Adaptive Residue Flexibility Prediction Method Improving Protein-ligand Docking Based on Homologous Proteins
IF 4, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2024-02-20, DOI: 10.2174/0115748936258790240101062642
Chuqi Lei, Senbiao Fang, Yaohang Li, Fei Guo, Min Li
Background: Computational molecular docking plays an important role in determining the precise receptor-ligand conformation, making it a powerful tool for drug discovery. Over the past 30 years, most computational docking methods have treated the receptor structure as a rigid body, although flexible docking often yields higher accuracy. The main disadvantage of flexible docking is its significantly higher computational cost. Because different protein pocket residues exhibit different degrees of flexibility, semi-flexible docking methods, which balance rigid and flexible docking, have succeeded in predicting highly accurate conformations at a relatively low computational cost. Method: In our study, the number of flexible pocket residues was assessed by quantitative analysis, and a novel adaptive residue flexibility prediction method, named A-RFP, was proposed to improve docking performance. Based on homologous information, a joint strategy predicts pocket residue flexibility by combining RMSD, the distance between the residue sidechain and the ligand, and the sidechain orientation. For each receptor-ligand pair, A-RFP provides the docking conformation with the optimal affinity. Result: By analyzing the docking affinities of 3507 target-ligand pairs at five different numbers of flexible residues ranging from 0 to 10, we found a general trend that a larger number of flexible residues improves the docking results obtained with AutoDock Vina. However, a certain number of counterexamples still exist. To validate the effectiveness of A-RFP, we ran a small-scale virtual screening on 5 proteins, which confirmed that A-RFP could enhance docking performance. Flexible-receptor virtual screening on a low-similarity dataset of 85 receptors further validated the accuracy of the comprehensive residue flexibility evaluation. Moreover, we studied three receptors with FDA-approved drugs, which further showed that A-RFP can play a useful role in ligand discovery. Conclusion: Our analysis confirms that screening performance for different numbers of flexible residues varies widely across receptors, which suggests that a fine-grained docking method would offset this deficiency. Thus, we presented A-RFP, an adaptive pocket residue flexibility prediction method based on homologous information. When computational resources and time costs are not considered, A-RFP provides the optimal docking result.
Citations: 0
CFCN: An HLA-peptide Prediction Model based on Taylor Extension Theory and Multi-view Learning
IF 4, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2024-02-16, DOI: 10.2174/0115748936299044240202100019
B. Rao, Bing Han, Leyi Wei, Zeyu Zhang, Xinbo Jiang, Balachandran Manavalan
With the rapid development of biotechnology, many solutions for cancer have been proposed. In recent years, neo-peptide-based methods have made significant contributions, with binding between peptides and HLA molecules as an essential prerequisite. However, this binding is hard to predict, and prediction accuracy still needs to improve. Therefore, we propose the Crossed Feature Correction Network (CFCN), a deep learning method that automatically extracts and adaptively learns the discriminative features of HLA-peptide binding in order to make more accurate predictions on HLA-peptide binding tasks. With its elaborate encoding and feature extraction process for peptides, together with feature fusion between the fine-grained and coarse-grained levels, it shows many advantages on the given tasks. Experiments show that CFCN achieves better overall performance than other advanced models in many respects. In addition, we consider using multi-view learning methods for the feature fusion process in order to uncover further relations among binding features. Finally, we package our model as a tool for further research on binding tasks.
Citations: 0
Sia-m7G: Predicting m7G Sites through the Siamese Neural Network with an Attention Mechanism
IF 4, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2024-02-09, DOI: 10.2174/0115748936285540240116065719
Jia Zheng, Yetong Zhou
Background: The chemical modification of RNA plays a crucial role in many biological processes. N7-methylguanosine (m7G), one of the most important epigenetic modifications, plays an important role in gene expression, processing, metabolism, and protein synthesis. Detecting the exact location of m7G sites in the transcriptome is key to understanding their mechanism in gene expression. Based on experimentally validated data, several machine learning and deep learning tools have been designed to identify internal m7G sites and have shown advantages over traditional experimental methods in speed, cost-effectiveness, and robustness. Aims: In this study, we aim to develop a computational model to help predict the exact location of m7G sites in humans. Objective: Simple yet advanced encoding methods and deep learning networks are designed to achieve excellent m7G prediction efficiently. Methods: Three types of feature extraction and six classification algorithms were tested to identify m7G sites. Our final model, named Sia-m7G, adopts one-hot encoding and a carefully designed Siamese neural network with an attention mechanism. In addition, multiple 10-fold cross-validation tests were conducted to evaluate our predictor. Results: Sia-m7G achieved the highest sensitivity, specificity, and accuracy in 10-fold cross-validation tests compared with six other m7G predictors. Nucleotide preference and model visualization analyses were conducted to strengthen the interpretability of Sia-m7G and provide a further understanding of m7G site fragments in genomic sequences. Conclusion: Sia-m7G has significant advantages over other classifiers and predictors, which demonstrates the superiority of the Siamese neural network algorithm in identifying m7G sites.
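The one-hot encoding mentioned in the abstract above can be sketched as follows. This is only a generic illustration of the technique, not the Sia-m7G implementation: the nucleotide order, the conversion of T to U, and the all-zero row for unknown symbols are assumptions made here, and the function name is hypothetical.

```python
# Assumed nucleotide order for the four one-hot columns.
NUCLEOTIDES = "ACGU"


def one_hot_encode(sequence):
    """Encode an RNA sequence as a list of 4-element 0/1 rows.

    Assumptions (not from the abstract): DNA-style 'T' is treated as 'U',
    and any symbol outside A/C/G/U (e.g. 'N') becomes an all-zero row.
    """
    index = {nt: i for i, nt in enumerate(NUCLEOTIDES)}
    encoded = []
    for nt in sequence.upper().replace("T", "U"):
        row = [0] * len(NUCLEOTIDES)
        if nt in index:
            row[index[nt]] = 1
        encoded.append(row)
    return encoded


# A hypothetical 5-nucleotide fragment.
for row in one_hot_encode("GACUN"):
    print(row)
```

Each sequence position becomes one row, so a fragment of length n yields an n x 4 matrix that a neural network can take as input.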
Citations: 0
Integrated Machine Learning Algorithms for Stratification of Patients with Bladder Cancer
IF 4, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2024-02-07, DOI: 10.2174/0115748936288453240124082031
Yuanyuan He, Haodong Wei, Siqing Liao, Ruiming Ou, Yuqiang Xiong, Yongchun Zuo, Lei Yang
Background: Bladder cancer is a prevalent malignancy globally, characterized by rising incidence and mortality rates. Stratifying bladder cancer patients into different subtypes is crucial for the effective treatment of this form of cancer. Therefore, there is a need to develop a stratification model specific to bladder cancer. Purpose: This study aims to establish a prognostic prediction model for bladder cancer, with the primary goal of accurately predicting prognosis and treatment outcomes. Methods: We collected datasets from 10 bladder cancer samples sourced from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases and the IMvigor210 dataset. Machine learning-based algorithms were used to generate 96 models for establishing a risk score for each patient. Based on the risk score, all patients were classified into two risk score groups. Results: The two groups of bladder cancer patients exhibited significant differences in prognosis, biological functions, and drug sensitivity. A nomogram model demonstrated that the risk score had a robust predictive effect with good clinical utility. Conclusion: The risk score constructed in this study can be used to predict the prognosis, response to drug treatment, and immunotherapy of bladder cancer patients, providing assistance for personalized clinical treatment of bladder cancer.
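The grouping step described in the abstract above (splitting patients into two groups by risk score) might look like the sketch below. The abstract does not state the cutoff, so a median split is assumed here purely for illustration; the patient identifiers, scores, and function name are hypothetical.

```python
from statistics import median


def stratify_by_risk(risk_scores):
    """Assign each patient to a 'high' or 'low' risk group.

    Assumption: the cutoff is the median risk score; patients strictly above
    the median are labelled 'high', all others 'low'.
    """
    cutoff = median(risk_scores.values())
    return {
        patient: ("high" if score > cutoff else "low")
        for patient, score in risk_scores.items()
    }


# Hypothetical risk scores for four patients.
scores = {"P1": 0.2, "P2": 0.9, "P3": 0.5, "P4": 0.7}
print(stratify_by_risk(scores))
```

A median split gives two groups of roughly equal size; other cutoffs, such as one optimized against survival data, are equally plausible and would change only the `cutoff` line.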
Citations: 0