Pub Date : 2024-06-06 DOI: 10.1016/j.compbiolchem.2024.108119
Meng Wang, Xinyue Yan, Yanan Dong, Xiaoqin Li, Bin Gao
Hepatocellular carcinoma (HCC) is a widespread primary liver cancer with a high fatality rate. Although several genes with oncogenic effects in HCC have been identified, many remain undiscovered. In this study, we conducted a comprehensive computational analysis to explore the involvement in HCC of genes that belong to the same families as known driver genes. Specifically, we expanded the concept beyond single-gene mutations to encompass gene families sharing homologous structures, integrating multiple omics data types to characterize gene abnormalities in cancer. Our analysis identified 74 domains with an enriched mutation burden, 404 domain mutation hotspots, and 233 dysregulated driver genes. We observed that specific low-frequency somatic mutations, which single-gene algorithms may overlook, may contribute to HCC occurrence. Furthermore, we systematically analyzed how abnormalities in the ubiquitin-proteasome system (UPS) affect HCC, finding that abnormalities in genes of the E3, E2, and DUB families and in degron genes often contribute to HCC by affecting the stability of oncogenic or tumor suppressor proteins. In conclusion, extending the search for driver genes to gene families with homologous structures is a promising strategy for uncovering additional oncogenic alterations in HCC.
Title: From driver genes to gene families: A computational analysis of oncogenic mutations and ubiquitination anomalies in hepatocellular carcinoma. Journal: Computational Biology and Chemistry (IF 3.1).
Pub Date : 2024-06-04 DOI: 10.1016/j.compbiolchem.2024.108117
Tarikul I. Milon, Yuhong Wang, Ryan L. Fontenot, Poorya Khajouie, Francois Villinger, Vijay Raghavan, Wu Xu
Understanding the mechanisms underlying interactions between drugs and target proteins is critical for drug discovery. In our earlier studies, we introduced the Triangular Spatial Relationship (TSR)-based algorithm, which represents a protein's 3D structure as a vector of integers (TSR keys). These TSR keys correspond to substructures of the protein's 3D structure and are computed from the triangles formed by all possible triples of Cα atoms within the protein. In this study, we report a new TSR-based algorithm for probing drug-target interactions. Specifically, we have extended the previous algorithm in three novel directions: TSR keys for representing the 3D structure of a drug or ligand, cross TSR keys between drugs and their targets, and intra-residual TSR keys for phosphorylated amino acids. The results illustrate the following key contributions: (i) The TSR-based method, which uses TSR keys as features, is unique in its ability to interpret hierarchical relationships among drugs, as well as among drug-target complexes, using common and specific TSR keys. (ii) The method can distinguish not only binding sites from the rest of the protein structure, but also the binding sites of primary targets from those of off-targets. (iii) The method has the potential to correlate the 3D structures of drugs with their functions. (iv) Representing 3D structures by TSR keys makes it easier to search for similar substructures across structure datasets. In summary, this study presents a novel computational methodology, with significant advantages, for providing insight into the mechanisms underlying drug-target interactions.
Title: Development of a novel representation of drug 3D structures and enhancement of the TSR-based method for probing drug and target interactions. Journal: Computational Biology and Chemistry (IF 3.1).
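The TSR idea in the abstract above (every triple of atoms forms a triangle, and each triangle is mapped to an integer) can be illustrated with a minimal sketch. The encoding below is an assumption made for illustration only: it bins the three sorted edge lengths and packs them into one integer, and it is not the published TSR key formula. The names `triangle_key`, `structure_keys`, `bin_width`, and `n_bins` are hypothetical.

```python
from itertools import combinations
from math import dist


def triangle_key(p1, p2, p3, bin_width=1.0, n_bins=40):
    """Map one triangle of 3D points to an integer (illustrative encoding only)."""
    # Sort the edge lengths so the key does not depend on vertex order.
    edges = sorted((dist(p1, p2), dist(p2, p3), dist(p1, p3)))
    # Discretise each edge length into a bin index, capped at n_bins - 1.
    bins = [min(int(e / bin_width), n_bins - 1) for e in edges]
    # Pack the three bin indices into a single integer.
    return (bins[0] * n_bins + bins[1]) * n_bins + bins[2]


def structure_keys(coords, **kwargs):
    """Return the sorted keys for all triples of points in one structure."""
    return sorted(
        triangle_key(a, b, c, **kwargs) for a, b, c in combinations(coords, 3)
    )
```

Two structures can then be compared by the keys they share, which is the sense in which an integer representation makes substructure search straightforward; the number of triangles grows as the cube of the number of atoms, so real structures need more care than this sketch takes.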
Oxyresveratrol (OXY), a natural stilbenoid in mulberry fruits, is known for its diverse pharmacological properties. However, its clinical use is hindered by low water solubility and limited bioavailability. In the present study, the inclusion complexes of OXY with β-cyclodextrin (βCD) and three of its analogs, dimethyl-β-cyclodextrin (DMβCD), hydroxypropyl-β-cyclodextrin (HPβCD), and sulfobutylether-β-cyclodextrin (SBEβCD), were investigated using in silico and in vitro studies. Molecular docking revealed two binding orientations of OXY, namely, the 4′,6′-dihydroxyphenyl ring (A-form) and the 5,7-benzenediol ring (B-form). Molecular dynamics simulations suggested the formation of inclusion complexes with βCDs through these two distinct orientations, with OXY/SBEβCD exhibiting the most atom contacts and the lowest solvent-exposed area in the hydrophobic cavity. These results corresponded well with the highest binding affinity, observed for OXY/SBEβCD, when assessed using the MM/GBSA method. Beyond traditional simulation methods, the Ligand-binding Parallel Cascade Selection Molecular Dynamics method was employed to investigate how the drug enters and is accommodated within the hydrophobic cavity. The in silico results aligned with the stability constants: SBEβCD (2060 M⁻¹), HPβCD (1860 M⁻¹), DMβCD (1700 M⁻¹), and βCD (1420 M⁻¹). All complexes exhibited a 1:1 binding mode (A_L type), with SBEβCD enhancing OXY solubility 25-fold. SEM micrographs, DSC thermograms, and FT-IR and ¹H NMR spectra confirmed inclusion complex formation, revealing novel surface morphologies, distinctive thermal behaviors, and new peaks. Notably, the inhibitory effect of the inclusion complexes on the proliferation of the MCF-7 breast cancer cell line, particularly for OXY/DMβCD, OXY/HPβCD, and OXY/SBEβCD, was markedly superior to that of OXY alone.
Title: Evaluating solubility, stability, and inclusion complexation of oxyresveratrol with various β-cyclodextrin derivatives using advanced computational techniques and experimental validation. Authors: Saba Ali, Aamir Aman, Kowit Hengphasatporn, Lipika Oopkaew, Bunyaporn Todee, Ryo Fujiki, Ryuhei Harada, Yasuteru Shigeta, Kuakarun Krusong, Kiattawee Choowongkomon, Warinthorn Chavasiri, Peter Wolschann, Panupong Mahalapbutr, Thanyada Rungrotmongkol. Pub Date : 2024-06-01 DOI: 10.1016/j.compbiolchem.2024.108111. Journal: Computational Biology and Chemistry (IF 3.1).
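The abstract above does not say how the stability constants were obtained. The "A_L type" label comes from the Higuchi-Connors classification of phase-solubility diagrams, and for a linear (A_L) diagram with 1:1 stoichiometry the constant is usually computed from the slope of the line and the intrinsic solubility S_0 of the guest. The relation is shown here as background, on the assumption that this standard analysis applies:

```latex
K_{1:1} = \frac{\text{slope}}{S_0 \,(1 - \text{slope})}
```

Here S_0 is the solubility of OXY without cyclodextrin (the intercept of the diagram) and the slope is taken from the plot of dissolved OXY concentration against cyclodextrin concentration; K_{1:1} then has units of M⁻¹ when concentrations are in mol/L.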
Pub Date : 2024-05-31 DOI: 10.1016/j.compbiolchem.2024.108114
Bahar Çiftçi, Ramazan Tekin
There are billions of virus species worldwide, and viruses, the smallest parasitic entities, pose a serious threat. Fighting the associated diseases therefore requires an understanding of the genetic structure of viruses. Given the wide diversity and rapid evolution of viruses, there is a critical need to classify viral species and their potential hosts quickly and accurately, both to better understand transmission dynamics and to facilitate the development of targeted therapies. To this end, this study investigated the classification of RNA viruses from their genomic sequences using Machine Learning (ML) and Deep Learning (DL) models. The PhyVirus dataset, consisting of pathogenic single-stranded RNA viruses of Baltimore groups four (+ssRNA) and five (-ssRNA) with different hosts and species, was analyzed. The viral gene sequences in the dataset were encoded using the K-Mer coding technique, which is based on base words of various lengths. The study used classical ML algorithms (Random Forest, Gradient Boosting, and Extra Trees) and a DL algorithm, the Fully Connected Deep Neural Network, to predict viral families and hosts. Classifier performance was analyzed in detail for different train-test ratios and different K-Mer word lengths (k-values). The results show that the Fully Connected Deep Neural Network achieved a high success rate of 99.60 % in predicting virus families. In predicting virus hosts, the Extra Trees classifier achieved the highest success rate, 81.53 %. This study is considered to be the first classification study in the literature on this dataset, which consists of gene sequences of single-stranded RNA viruses and has a very large diversity of families and hosts. The detailed investigation of how varying K-Mer word lengths affect classification into viral families and hosts makes this study particularly valuable. The study shows that ML and DL methods have the potential to produce valuable results in phylogenetic studies. In addition, the results and high performance values show that these methods can be used successfully in regenerative applications of gene sequences or in studies such as the elimination of losses in gene sequences.
Title: Prediction of viral families and hosts of single-stranded RNA viruses based on K-Mer coding from phylogenetic gene sequences. Journal: Computational Biology and Chemistry (IF 3.1).
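The K-Mer coding mentioned in the abstract above turns a sequence into counts of overlapping words of length k. A minimal sketch follows, assuming a four-letter DNA-style alphabet and frequency normalisation; the function name `kmer_features` and both defaults are illustrative choices, not details taken from the study.

```python
from collections import Counter
from itertools import product


def kmer_features(sequence, k=3, alphabet="ACGT"):
    """Return a fixed-length vector of k-mer frequencies for one sequence."""
    # Enumerate every possible word of length k so all sequences share one layout.
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    seq = sequence.upper()
    # Count overlapping windows of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made entirely of alphabet letters contribute to the total.
    total = sum(counts[w] for w in vocab)
    return [counts[w] / total if total else 0.0 for w in vocab]
```

The vector has `len(alphabet) ** k` entries, so larger k gives more specific words but a much sparser feature space, which is the trade-off behind comparing several k-values.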
Taste is crucial in driving food choice and preference. Umami is one of the basic tastes, defined by the characteristic deliciousness and mouthfulness that it imparts to foods. Identifying ingredients that enhance umami taste is of significant value to the food industry. Various models have been shown to predict umami taste using feature encodings derived from traditional molecular descriptors such as amphiphilic pseudo-amino acid composition, dipeptide composition, and composition-transition-distribution. The highest reported accuracy, 90.5 %, was recently achieved through a novel model architecture. Here, we propose using biological sequence transformers such as ProtBert and ESM2, trained on the UniRef databases, as the feature-encoder block. By combining 2 encoders with 2 classifiers, 4 model architectures were developed. Among the 4 models, the ProtBert-CNN model outperformed the others, with an accuracy of 95 % on 5-fold cross-validation data and 94 % on independent data.
Title: UmamiPreDL: Deep learning model for umami taste prediction of peptides using BERT and CNN. Authors: Arun Pandiyan Indiran, Humaira Fatima, Sampriti Chattopadhyay, Sureshkumar Ramadoss, Yashwanth Radhakrishnan. Pub Date : 2024-05-29 DOI: 10.1016/j.compbiolchem.2024.108116. Journal: Computational Biology and Chemistry (IF 3.1).
Pub Date : 2024-05-29 DOI: 10.1016/j.compbiolchem.2024.108106
Xiaolei Zhang, Juan Liu, Feng Yang, Qiang Zhang, Zhihui Yang, Hayat Ali Shah
The bioretrosynthesis problem is to predict synthetic routes from substrates to given natural products (NPs). However, the huge number of metabolic reactions leads to a combinatorial explosion of the search space, which makes the search highly time-consuming and costly. Here, we propose a framework called BioRetro that predicts bioretrosynthesis pathways using a one-step bioretrosynthesis network, termed HybridMLP, combined with an AND-OR tree heuristic search. HybridMLP predicts the precursors that will produce the target NPs, while the AND-OR tree iteratively generates multi-step biosynthetic pathways. One-step bioretrosynthesis prediction experiments conducted on the MetaNetX dataset show that HybridMLP achieves top-1, top-5, and top-10 accuracies of 46.5%, 74.6%, and 81.6%, respectively. This performance demonstrates the effectiveness of HybridMLP in one-step bioretrosynthesis. In addition, evaluation on two benchmark datasets reveals that BioRetro can significantly improve the speed and success rate of predicting biosynthesis pathways. BioRetro is further shown to find synthetic pathways for compounds such as ginsenoside F1 that use the same substrates as reported but different enzymes, which may be novel candidate enzymes with better catalytic performance.
Title: Planning biosynthetic pathways of target molecules based on metabolic reaction prediction and AND-OR tree search. Journal: Computational Biology and Chemistry (IF 3.1).
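The AND-OR structure named in the abstract above can be sketched in a few lines: a molecule is an OR node (any one reaction producing it suffices), and a reaction is an AND node (all of its precursors must be obtainable). The sketch below is a plain depth-first search over a hand-written reaction table. It stands in for the idea only: in BioRetro the candidate precursors come from HybridMLP and the search is heuristic, neither of which is reproduced here, and the names `solve`, `reactions`, and `building_blocks` are hypothetical.

```python
def solve(molecule, reactions, building_blocks, depth=5, _path=frozenset()):
    """Return a list of (product, precursors) steps that makes `molecule`, or None."""
    if molecule in building_blocks:
        return []  # already available, so no step is needed
    if depth == 0 or molecule in _path:
        return None  # depth limit reached or a cycle was detected
    # OR node: any single candidate reaction may solve the molecule.
    for precursors in reactions.get(molecule, []):
        steps = []
        # AND node: every precursor of the chosen reaction must be solvable.
        for p in precursors:
            sub = solve(p, reactions, building_blocks, depth - 1, _path | {molecule})
            if sub is None:
                break
            steps.extend(sub)
        else:
            return steps + [(molecule, tuple(precursors))]
    return None


# Toy reaction table (made-up labels): target T has two candidate reactions.
reactions = {"T": [("X",), ("A", "B")], "A": [("S1",)], "B": [("S1", "S2")]}
route = solve("T", reactions, building_blocks={"S1", "S2"})
```

In this toy table the first candidate for `T` fails because `X` cannot be made, so the search falls back to the second candidate and returns the steps in the order they would be carried out, ending with the step that produces `T`.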
Pub Date : 2024-05-27 DOI: 10.1016/j.compbiolchem.2024.108113
Jia-Lin Cui, Hua Li, Qi He, Bin-Yan Jin, Zhe Liu, Xiao-Ming Zhang, Li Zhang
The integration of artificial intelligence (AI) into smart agriculture boosts production and management efficiency, facilitating sustainable agricultural development. In intensive agricultural management, adopting eco-friendly and effective pesticides is crucial to promoting green agricultural practices. However, discovering new insecticide species is a difficult and time-consuming task that involves significant risks. Enhancing compound druggability in the lead discovery phase could considerably shorten the discovery cycle, accelerating insecticide research and development. This study introduces the Insecticide Activity Prediction (IAPred) model, a novel classic-AI-based method for evaluating the potential insecticidal activity of compounds whose function is unknown. The IAPred model uses 27 insecticide-likeness features from PaDEL descriptors and an ensemble of Support Vector Machine (SVM) and Random Forest (RF) algorithms combined by a hard-vote mechanism, achieving an accuracy of 86 %. Notably, the IAPred model outperforms current models by accurately predicting the efficacy of novel insecticides such as nicofluprole, overcoming the limitations inherent in existing insecticide structures. Our research presents a practical approach for discovering and optimizing novel insecticide lead compounds quickly and efficiently.
Title: Integrating classic AI and agriculture: A novel model for predicting insecticide-likeness to enhance efficiency in insecticide development. Journal: Computational Biology and Chemistry (IF 3.1).
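The hard-vote mechanism mentioned in the abstract above means that each classifier casts one vote per compound and the most common label wins. A minimal sketch of that rule follows; it operates on already-computed label lists, and the tie-breaking rule (the earliest model's label wins) is an assumption, since the abstract does not say how ties between the SVM and RF members are resolved. The function name `hard_vote` is hypothetical.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, sample by sample."""
    voted = []
    # zip(*...) regroups the labels so each tuple holds all votes for one sample.
    for labels in zip(*predictions_per_model):
        counts = Counter(labels)
        top = max(counts.values())
        # Ties go to the label of the earliest model in the list (an assumption).
        voted.append(next(label for label in labels if counts[label] == top))
    return voted
```

Hard voting uses only the predicted labels, in contrast to soft voting, which averages predicted probabilities; with an even number of models the tie rule matters, which is why it is made explicit in the sketch.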
Pub Date : 2024-05-27 DOI: 10.1016/j.compbiolchem.2024.108112
Naveen Kumar V, T. Tamilanban
Venous leg ulcers (VLUs) pose a growing healthcare challenge due to aging, obesity, and sedentary lifestyles. Despite the various treatments available, addressing the complex nature of VLUs remains difficult. In this context, this study investigates repurposing boronated drugs to inhibit arginase 1 activity for VLU treatment. The molecular docking study, conducted with Schrödinger GLIDE, targeted the binuclear manganese cluster of the arginase 1 enzyme (2PHO). The ligand-protein complex was then subjected to a 500 ns molecular dynamics simulation in GROMACS 2019.4. Trajectory analysis of protein RMSD, RMSF, RG, SASA, and H-bonds was performed using the GROMACS simulation package. In the docking study, tavaborole showed a better docking score (-3.957 kcal/mol) than the substrate L-arginine (-3.379 kcal/mol) and the standard L-norvaline (-3.141 kcal/mol). The interaction of tavaborole with aspartic acid suggests that the drug molecule binds to the catalytic site of arginase 1, potentially influencing the enzyme's function. The dynamics study indicated that the compound remained stable and the protein remained compact throughout the simulation. The RMSD, RMSF, SASA, RG, inter- and intra-molecular H-bond, PCA, FEL, and MMBSA analyses characterized the flexibility, compactness, binding energy, van der Waals energy, and solvation dynamics of the ligand-protein complex and the protein. These results indicate that the ligand is stable and interacts with the catalytic site of the arginase 1 enzyme, motivating further study of tavaborole for VLU treatment.
Title: Computational therapeutic repurposing of tavaborole targeting arginase-1 for venous leg ulcer. Journal: Computational Biology and Chemistry (IF 3.1).
Pub Date : 2024-05-23DOI: 10.1016/j.compbiolchem.2024.108099
Jiedong Wei , Yijia Zhang , Xingwang Li , Mingyu Lu , Hongfei Lin
The combination of deep learning and the medical field has recently achieved great success, particularly in recommending medicine for patients. However, patients’ clinical records often contain repeated medical information that can significantly impact their health condition. Most existing methods for modeling longitudinal patient information overlook the impact of individual diagnoses and procedures on the patient’s health, resulting in insufficient patient representation and limited accuracy of medicine recommendations. Therefore, we propose a medicine recommendation model called KEAN, which is based on an attention aggregation network and enhanced graph convolution. Specifically, KEAN can aggregate individual diagnoses and procedures in patient visits to capture significant features that affect patients’ diseases. We further incorporate medicine knowledge from complex medicine combinations, reduce drug–drug interactions (DDIs), and recommend medicines that are beneficial to patients’ health. The experimental results on the MIMIC-III dataset demonstrate that our model outperforms existing advanced methods, which highlights the effectiveness of the proposed method.
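The idea of aggregating individual diagnoses and procedures within a visit can be pictured as a softmax-weighted sum of code embeddings. The sketch below shows generic dot-product attention pooling in plain Python. It is an assumption-laden illustration, not KEAN's actual architecture: the names `attention_aggregate` and `query`, and the two-dimensional toy embeddings, are hypothetical, and in a trained model the query vector would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(embeddings, query):
    """Pool code embeddings into one visit vector using dot-product attention."""
    # Score each code embedding against the (hypothetical) query vector.
    scores = [sum(e_k * q_k for e_k, q_k in zip(e, query)) for e in embeddings]
    weights = softmax(scores)
    # Weighted sum of the embeddings, one dimension at a time.
    visit = [
        sum(w * e[k] for w, e in zip(weights, embeddings))
        for k in range(len(query))
    ]
    return visit, weights


# Two toy diagnosis embeddings; the query is aligned with the first one.
codes = [[1.0, 0.0], [0.0, 1.0]]
visit_vector, attn = attention_aggregate(codes, query=[1.0, 0.0])
print(visit_vector, attn)
```

Codes whose embeddings align with the query receive larger weights, so they contribute more to the visit representation than a plain average would allow.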
{"title":"Knowledge enhanced attention aggregation network for medicine recommendation","authors":"Jiedong Wei , Yijia Zhang , Xingwang Li , Mingyu Lu , Hongfei Lin","doi":"10.1016/j.compbiolchem.2024.108099","DOIUrl":"10.1016/j.compbiolchem.2024.108099","url":null,"PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141137592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-22DOI: 10.1016/j.compbiolchem.2024.108110
Mohamed Abd Elaziz , Abdelghani Dahou , Ahmad O. Aseeri , Ahmed A. Ewees , Mohammed A.A. Al-qaness , Rehab Ali Ibrahim
Recent advances in artificial intelligence can play a vital role in the Internet of Medical Things (IoMT). Automatic diagnosis, including cancer diagnosis, is one of the most important topics in the IoMT. Breast cancer is one of the leading causes of death among women. Accurate diagnosis and early detection of breast cancer can improve the survival rate of patients. Deep learning models have demonstrated outstanding potential in accurately detecting and diagnosing breast cancer. This paper proposes a novel approach for breast cancer detection using CrossViT as the deep learning model and an enhanced version of the Growth Optimizer algorithm (MGO) as the feature selection method. CrossViT is a hybrid deep learning model that combines the strengths of both convolutional neural networks (CNNs) and transformers. The MGO is a meta-heuristic algorithm that selects the most relevant features from a large pool of features to enhance the performance of the model. The developed approach was evaluated on three publicly available breast cancer datasets and achieved competitive performance compared to other state-of-the-art methods. The results show that the combination of CrossViT and the MGO can effectively identify the most informative features for breast cancer detection, potentially assisting clinicians in making accurate diagnoses and improving patient outcomes. The MGO algorithm improves accuracy by approximately 1.59% on INbreast, 5.00% on MIAS, and 0.79% on MiniDDSM compared to other methods on each respective dataset. The developed approach can also be used to improve the Quality of Service (QoS) in the healthcare system as a deployable IoT-based intelligent solution or a decision-making assistance service, enhancing the efficiency and precision of diagnosis.
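Meta-heuristic feature selection is often framed as a search over binary masks, where each candidate mask is scored by a fitness value. The abstract does not give MGO's update rules or objective, so the sketch below only illustrates a commonly used wrapper-style fitness under stated assumptions: the weighting `alpha`, the 0.5 binarization threshold, and the names `to_mask` and `fitness` are all hypothetical, and the classifier error rate is passed in rather than computed.

```python
def to_mask(position, threshold=0.5):
    """Binarize a continuous search-agent position into a feature mask (assumed rule)."""
    return [1 if value > threshold else 0 for value in position]


def fitness(mask, error_rate, alpha=0.99):
    """Lower is better: trade off classifier error against the fraction of features kept."""
    selected = sum(mask)
    if selected == 0:
        # A mask that keeps no features cannot be evaluated by a classifier.
        return float("inf")
    return alpha * error_rate + (1 - alpha) * selected / len(mask)


# Hypothetical agent position over four deep features, and an assumed
# validation error of 10% for the classifier trained on the kept features.
mask = to_mask([0.9, 0.2, 0.7, 0.1])
score = fitness(mask, error_rate=0.10)
print(mask, score)
```

With a weighting like this, two masks with the same error rate are ranked by how few features they keep, which is what pushes the search toward compact feature subsets.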
{"title":"Cross vision transformer with enhanced Growth Optimizer for breast cancer detection in IoMT environment","authors":"Mohamed Abd Elaziz , Abdelghani Dahou , Ahmad O. Aseeri , Ahmed A. Ewees , Mohammed A.A. Al-qaness , Rehab Ali Ibrahim","doi":"10.1016/j.compbiolchem.2024.108110","DOIUrl":"10.1016/j.compbiolchem.2024.108110","url":null,"PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141143989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}