
Computational Biology and Chemistry: latest articles

Leveraging protein language model embeddings and logistic regression for efficient and accurate in-silico acidophilic proteins classification
IF 2.6 Zone 4 Biology Q2 BIOLOGY Pub Date: 2024-07-26 DOI: 10.1016/j.compbiolchem.2024.108163

The increasing demand for eco-friendly technologies in biotechnology necessitates effective and sustainable catalysts. Acidophilic proteins, which function optimally in highly acidic environments, hold promise for applications including food production, biofuels, and bioremediation. However, limited knowledge about these proteins hinders their exploration. This study addresses this gap with in silico methods that combine computational tools and machine learning. We propose a novel approach to predict acidophilic proteins using protein language models (PLMs), accelerating discovery without extensive lab work. Our investigation highlights the potential of PLMs in understanding and harnessing acidophilic proteins for scientific and industrial advancements. We introduce the ACE model, which combines a simple Logistic Regression model with embeddings derived from protein sequences processed by the ProtT5 PLM. This model achieves high performance on an independent test set, with an accuracy of 0.91, an F1-score of 0.93, and a Matthews correlation coefficient of 0.76. To our knowledge, this is the first application of pre-trained PLM embeddings for acidophilic protein classification. The ACE model serves as a powerful tool for exploring protein acidophilicity, paving the way for future advancements in protein design and engineering.
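The pipeline described in this abstract (fixed-length embeddings fed to a logistic regression, scored with accuracy, F1 and the Matthews correlation coefficient) can be sketched in plain Python. This is a minimal illustration, not the ACE implementation: the two-dimensional `toy_embeddings` and their labels are invented stand-ins for ProtT5 embeddings, and the learning rate and epoch count are arbitrary assumptions.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Label 1 when the linear score is non-negative (probability >= 0.5)."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0.0 else 0
            for xi in X]


def binary_metrics(y_true, y_pred):
    """Accuracy, F1 and Matthews correlation coefficient from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Hypothetical 2-D "embeddings"; real ProtT5 vectors have many more dimensions.
toy_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8]]
toy_labels = [1, 1, 0, 0]  # 1 = acidophilic, 0 = non-acidophilic (invented)

w, b = train_logistic_regression(toy_embeddings, toy_labels)
print(binary_metrics(toy_labels, predict(toy_embeddings, w, b)))
```

The Matthews correlation coefficient uses all four confusion-matrix cells, which is one reason it can sit well below accuracy and F1 on imbalanced test sets.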

Citations: 0
Comparative mitochondrial genomics analysis of selected species of Schizothoracinae sub family to explore the differences at mitochondrial DNA level
IF 2.6 Zone 4 Biology Q2 BIOLOGY Pub Date: 2024-07-26 DOI: 10.1016/j.compbiolchem.2024.108165

This study presents, for the first time, a comprehensive analysis of whole mitochondrial genomes from the Schizothoracinae subfamily of the family Cyprinidae. The species analyzed are Schizothorax niger, Schizothorax esocinus, Schizothorax labiatus and Schizothorax plagiostomus. The total mitochondrial DNA (mtDNA) lengths were 16585 bp, 16583 bp, 16582 bp and 16576 bp, respectively, each with 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and 2 non-coding regions. The combined mean base composition of the four species was A: 29.91 %, T: 25.47 %, G: 17.65 %, C: 27.01 %. The GC content ranged from 44 % to 45 %. All protein-coding genes (PCGs) began with the typical ATG codon, except for the cytochrome c oxidase subunit 1 (COX1) gene, which began with GTG. The analysis of vital amino acid biosynthesis genes (COX1, ATPase 6, ATPase 8) in the four species revealed no significant differences. All 13 PCGs had Ka/Ks ratios less than one, indicating purifying selection on those genes. The tRNA genes were predicted to fold into the typical cloverleaf secondary structure with normal base pairing and ranged in size from 66 to 75 nucleotides. Additionally, the phylogenetic tree analysis revealed that S. esocinus was most similar to S. labiatus. This study provides critical data for phylogenetic analysis of the Schizothoracinae subfamily, which will help to resolve taxonomic difficulties and identify evolutionary links. Detailed mtDNA data are an invaluable resource for studying genetic diversity, population structure, and gene flow. Understanding genetic makeup can help inform conservation plans, identify unique populations, and track genetic variation to ensure effective preservation.
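The base-composition and GC-content figures quoted in this abstract follow from simple counting. The sketch below shows the arithmetic; the function names and the eight-base toy sequence are illustrative assumptions, and only the four mean percentages come from the abstract.

```python
from collections import Counter


def base_composition(sequence):
    """Return the percentage of A, C, G and T in a DNA sequence."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(composition):
    """GC content is the sum of the G and C percentages."""
    return composition["G"] + composition["C"]


# Mean base composition reported in the abstract (percent).
reported_mean = {"A": 29.91, "T": 25.47, "G": 17.65, "C": 27.01}
print(round(gc_content(reported_mean), 2))  # 44.66

# Toy sequence (hypothetical) to show the counting step itself.
toy_composition = base_composition("AACCGGTT")
print(toy_composition["A"], gc_content(toy_composition))  # 25.0 50.0
```

The mean G and C percentages sum to 44.66 %, which sits inside the 44 % to 45 % GC range stated for the four genomes.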

Citations: 0
Exploring SALL4 as a significant prognostic marker in breast cancer and its association with progression pathways involved in cancer genesis
IF 2.6 Zone 4 Biology Q2 BIOLOGY Pub Date: 2024-07-26 DOI: 10.1016/j.compbiolchem.2024.108164

Breast carcinoma is the leading cause of cancer-related death in women. Because of its numerous inherent molecular subtypes, breast cancer is an extremely heterogeneous disease. Among these subtypes, the human epidermal growth factor receptor 2 (HER2) positive subtype stands out as being especially prone to cancer development and disease recurrence. SALL4 (Spalt-like transcription factor 4), a member of the Spalt-like family, regulates the pluripotency and self-renewal of embryonic stem cells. Numerous molecular pathways operating at the transcriptional, post-transcriptional, and epigenomic levels regulate the expression of SALL4. Many transcription factors control the expression of SALL4, with STAT3 being the primary regulator in hepatocellular carcinoma (HCC) and breast carcinoma. Moreover, this oncogene has been connected to a number of cellular functions, including invasion, apoptosis, proliferation, and resistance to therapy. Higher levels of SALL4 have been linked to reduced patient survival and a worse prognosis. In order to target the undruggable SALL4 that is overexpressed in breast carcinoma, we investigated the prognostic significance of SALL4 in breast carcinoma and its interaction with various related proteins. Using TIMER 2.0 analysis, the expression pattern of SALL4 was investigated across all TCGA datasets. The analysis revealed that SALL4 expression was elevated in various cancers. The UALCAN findings demonstrated that SALL4 was overexpressed in all tumor samples, including breast cancer and especially triple-negative breast cancer (TNBC). The web-based ENRICHR program was used for gene ontology analysis, which revealed that SALL4 was actively involved in the development of the nervous system, positive regulation of stem cell proliferation, regulation of stem cell proliferation, regulation of the activin receptor signaling pathway, regulation of DNA-templated transcription, miRNA metabolic processes, and regulation of transcription by RNA Polymerase I. Using the STRING database, we analyzed the interaction and involvement of SALL4 with other abruptly activated proteins and used Cytoscape 3.8.0 for visualization. Additionally, using bc-GenExMiner, we studied the impact of SALL4 on pathways abruptly activated in different breast cancer subtypes, which revealed that SALL4 was highly correlated with WNT2B, NOTCH4, AKT3, and PIK3CA. Furthermore, to target SALL4, we evaluated and analyzed the impact of CLP and its analogues, revealing promising outcomes.

Citations: 0
A novel radial basis neural network for the Zika virus spreading model
IF 2.6 Zone 4 Biology Q2 BIOLOGY Pub Date: 2024-07-25 DOI: 10.1016/j.compbiolchem.2024.108162

The aim of the current investigation is to design a novel radial basis neural network stochastic structure that provides numerical solutions of the Zika virus spreading model (ZVSM). The mathematical ZVSM is divided into human and vector populations based on the susceptible S(q), exposed E(q), infected I(q) and recovered R(q) classes, i.e., SEIR. The stochastic solver is built from a feed-forward neural network with a radial basis activation function and twenty-two neurons, optimized with Bayesian regularization, in order to solve the ZVSM. A dataset is obtained using the explicit Runge-Kutta scheme and is used during training to reduce the mean square error (MSE) when solving the nonlinear ZVSM. The data are divided into 78 % for training and 11 % each for validation and testing. Three different cases of the nonlinear ZVSM are considered, and the scheme's correctness is checked by comparing the results. Furthermore, the reliability of the scheme is assessed through regression, MSE, error histogram and state transition measures.
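The data-generation step described here (an explicit Runge-Kutta solver producing reference trajectories that are then split 78 % / 11 % / 11 %) can be sketched as follows. This is a sketch under stated assumptions: the abstract does not give the ZVSM equations, so a generic single-population SEIR system stands in for the coupled human and vector model, and the rates `beta`, `sigma`, `gamma`, the step size and the initial state are invented values.

```python
def seir_rhs(state, beta, sigma, gamma):
    """Right-hand side of a generic SEIR model (stand-in for the ZVSM)."""
    s, e, i, r = state
    new_infections = beta * s * i / (s + e + i + r)
    return (-new_infections,
            new_infections - sigma * e,
            sigma * e - gamma * i,
            gamma * i)


def rk4_step(f, state, h, *params):
    """One classical fourth-order explicit Runge-Kutta step of size h."""
    k1 = f(state, *params)
    k2 = f(tuple(x + 0.5 * h * k for x, k in zip(state, k1)), *params)
    k3 = f(tuple(x + 0.5 * h * k for x, k in zip(state, k2)), *params)
    k4 = f(tuple(x + h * k for x, k in zip(state, k3)), *params)
    return tuple(x + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(state, h, steps, beta, sigma, gamma):
    """Return the trajectory [state_0, ..., state_steps]."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(seir_rhs, state, h, beta, sigma, gamma)
        trajectory.append(state)
    return trajectory


def split_indices(n, train_frac=0.78, val_frac=0.11):
    """Split range(n) into training, validation and testing index lists."""
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    indices = list(range(n))
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


# Hypothetical initial fractions (S, E, I, R) and rates, for illustration only.
trajectory = simulate((0.99, 0.0, 0.01, 0.0), h=0.1, steps=100,
                      beta=0.5, sigma=0.2, gamma=0.1)
train, val, test = split_indices(100)
print(len(trajectory), len(train), len(val), len(test))
```

In practice the split would usually be randomized rather than contiguous; the contiguous version is kept here only to make the 78/11/11 proportions easy to check.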

Citations: 0
MMCL-CPI: A multi-modal compound-protein interaction prediction model incorporating contrastive learning pre-training
IF 2.6 Zone 4 Biology Q2 BIOLOGY Pub Date: 2024-07-25 DOI: 10.1016/j.compbiolchem.2024.108137

Motivation

Compound-protein interaction (CPI) prediction plays a crucial role in drug discovery and drug repositioning. Early researchers relied on time-consuming and labor-intensive wet laboratory experiments. The advent of deep learning has significantly accelerated this process. Most existing deep learning methods use deep neural networks to extract compound features from sequences and graphs, either separately or in combination. Our team's previous research has demonstrated that compound images contain valuable information that can be leveraged for the CPI task. However, there is a scarcity of multimodal methods that effectively combine sequence and image representations of compounds in CPI. Currently, the use of text-image pairs for contrastive language-image pre-training is a popular approach in the multimodal field. Further research is needed to explore how the integration of sequence and image representations can enhance the accuracy of the CPI task.

Results

This paper presents a novel method called MMCL-CPI, which has two key highlights: 1) First, we propose extracting compound features from two modalities: one-dimensional SMILES and two-dimensional images. This approach enables us to capture both sequence and spatial features, enhancing the prediction accuracy for CPI. Based on this, we design a novel multimodal model. 2) Second, we introduce a multimodal pre-training strategy that leverages contrastive learning on a large-scale unlabeled dataset to establish the correspondence between a SMILES string and the compound's image. This pre-training approach significantly improves compound feature representations for the downstream CPI task. Our method has shown competitive results on multiple datasets.
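The pre-training idea in the second highlight, pulling each SMILES embedding towards the image embedding of the same compound and away from the others in the batch, can be illustrated with a symmetric InfoNCE loss of the kind used in contrastive language-image pre-training. The abstract does not state the exact loss used by MMCL-CPI, so the formulation below, the `temperature` value and the tiny hand-written embeddings are all assumptions made for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    top = max(row)
    log_sum = top + math.log(sum(math.exp(x - top) for x in row))
    return [x - log_sum for x in row]


def contrastive_loss(smiles_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss; row i of each input describes compound i."""
    a = [l2_normalize(v) for v in smiles_emb]
    b = [l2_normalize(v) for v in image_emb]
    n = len(a)
    # Cosine similarity of every SMILES/image pair, scaled by the temperature.
    logits = [[sum(x * y for x, y in zip(a[i], b[j])) / temperature
               for j in range(n)] for i in range(n)]
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    # Matching pairs lie on the diagonal in both directions.
    smiles_to_image = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    image_to_smiles = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (smiles_to_image + image_to_smiles)


# Hypothetical 2-D embeddings for two compounds.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)  # True
```

The loss is small when each SMILES vector is closest to its own image vector and large when the pairing is wrong, which is the signal that lets unlabeled compounds train both encoders.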

Citations: 0
Insights into the radiation and oxidative stress mechanisms in genus Deinococcus
IF 2.6 Zone 4 Biology Q2 BIOLOGY Pub Date: 2024-07-25 DOI: 10.1016/j.compbiolchem.2024.108161

Deinococcus species, noted for their exceptional resistance to DNA-damaging environmental stresses, have piqued scientists' interest for decades. This study examines the complex mechanisms underpinning radiation resistance in the Deinococcus genus. We examined the genomes of 82 Deinococcus species and manually classified radiation-resistance proteins into five curated categories: DNA repair, oxidative stress defense, Ddr and Ppr proteins, regulatory proteins, and miscellaneous resistance components. This classification reveals important information about the various molecular mechanisms used by these extremophiles, which have been less explored so far. We also investigated the presence or absence of these proteins in the context of phylogenetic relationships and of core and pan-genomes, which shed light on the evolutionary dynamics of radiation resistance. This comprehensive study provides a deeper understanding of the genetic underpinnings of radiation resistance in the Deinococcus genus, with potential implications for understanding similar mechanisms in other organisms using an interactomics approach. Finally, this study reveals the complexities of radiation resistance mechanisms, providing a comprehensive understanding of the genetic components that allow Deinococcus species to flourish in harsh environments. The findings add to our understanding of the larger spectrum of stress adaptation strategies in bacteria and may have applications in sectors ranging from biotechnology to environmental research.
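The presence/absence analysis against core and pan-genomes mentioned here reduces to set operations once each genome is represented as a set of protein families. The sketch below is an assumption-laden illustration: the species labels and the presence pattern are invented, and the protein names are used only as familiar-looking labels, not as reported results.

```python
def core_and_pan(genomes):
    """Return (core, pan): families in every genome and in any genome."""
    if not genomes:
        raise ValueError("at least one genome is required")
    family_sets = [set(families) for families in genomes.values()]
    core = set.intersection(*family_sets)
    pan = set.union(*family_sets)
    return core, pan


def presence_matrix(genomes, families):
    """Map each genome to a 0/1 row over the given family order."""
    return {name: [1 if family in present else 0 for family in families]
            for name, present in genomes.items()}


# Hypothetical presence/absence data for three unnamed species.
toy_genomes = {
    "species_A": {"RecA", "PprA", "DdrA"},
    "species_B": {"RecA", "PprA"},
    "species_C": {"RecA", "DdrA", "IrrE"},
}
core, pan = core_and_pan(toy_genomes)
print(sorted(core))  # ['RecA']
print(sorted(pan))   # ['DdrA', 'IrrE', 'PprA', 'RecA']
print(presence_matrix(toy_genomes, sorted(pan))["species_B"])  # [0, 0, 1, 1]
```

Families in the core set are candidates for conserved resistance machinery, while families found only in the pan-genome point to lineage-specific gains or losses.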

Citations: 0
CSSP-2.0: A refined consensus method for accurate protein secondary structure prediction
IF 2.6 Zone 4 Biology Q2 BIOLOGY Pub Date: 2024-07-23 DOI: 10.1016/j.compbiolchem.2024.108158

Studying the relationship between sequences and their corresponding three-dimensional structures assists structural biologists in solving the protein-folding problem. Despite several experimental and in-silico approaches, decoding the three-dimensional structure from the sequence remains a mystery. In such cases, the accuracy of the structure prediction plays an indispensable role. To address this issue, an updated web server (CSSP-2.0) has been created to improve on the accuracy of our previous version of CSSP by deploying existing algorithms. It takes probabilities as input and predicts a consensus for the secondary structure as a highly accurate three-state Q3 assignment (helix, strand, and coil). This prediction is achieved using six recent top-performing methods: MUFOLD-SS, RaptorX, PSSpred v4, PSIPRED, JPred v4, and Porter 5.0. CSSP-2.0 validation includes datasets involving various protein classes from the PDB, CullPDB, and AlphaFold databases. Our results indicate a significant improvement in the accuracy of the consensus Q3 prediction. Using CSSP-2.0, crystallographers can sort out the stable regular secondary structures from the entire complex structure, which would aid in inferring the functional annotation of hypothetical proteins. The web server is freely available at https://bioserver3.physics.iisc.ac.in/cgi-bin/cssp-2/
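A consensus over per-residue probabilities from several predictors can be sketched as below. The abstract does not say how CSSP-2.0 combines the six methods, so the unweighted averaging used here is an assumption chosen for simplicity, and the three `method_*` probability lists are invented inputs rather than output from any of the named tools.

```python
STATES = "HEC"  # helix, strand, coil


def consensus_q3(predictions):
    """Average (P_H, P_E, P_C) across methods and pick the top state."""
    n_methods = len(predictions)
    n_residues = len(predictions[0])
    if any(len(method) != n_residues for method in predictions):
        raise ValueError("all methods must cover the same residues")
    result = []
    for r in range(n_residues):
        mean = [sum(method[r][k] for method in predictions) / n_methods
                for k in range(3)]
        result.append(STATES[max(range(3), key=mean.__getitem__)])
    return "".join(result)


def q3_accuracy(predicted, reference):
    """Fraction of residues whose three-state label matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("strings must have equal length")
    return sum(p == t for p, t in zip(predicted, reference)) / len(reference)


# Hypothetical probabilities for a three-residue peptide from three methods.
method_a = [(0.8, 0.1, 0.1), (0.2, 0.6, 0.2), (0.3, 0.3, 0.4)]
method_b = [(0.6, 0.2, 0.2), (0.1, 0.7, 0.2), (0.5, 0.1, 0.4)]
method_c = [(0.7, 0.2, 0.1), (0.3, 0.3, 0.4), (0.1, 0.2, 0.7)]

consensus = consensus_q3([method_a, method_b, method_c])
print(consensus)  # HEC
print(q3_accuracy(consensus, "HEE"))
```

Averaging probabilities lets a confident minority outvote an uncertain majority, as at the third residue above, which a plain majority vote over hard labels would not capture.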

{"title":"CSSP-2.0: A refined consensus method for accurate protein secondary structure prediction","authors":"","doi":"10.1016/j.compbiolchem.2024.108158","DOIUrl":"10.1016/j.compbiolchem.2024.108158","url":null,"abstract":"<div><p>Studying the relationship between sequences and their corresponding three-dimensional structure assists structural biologists in solving the protein-folding problem. Despite several experimental and <em>in-silico</em> approaches, still understanding or decoding the three-dimensional structures from the sequence remains a mystery. In such cases, the accuracy of the structure prediction plays an indispensable role. To address this issue, an updated web server (CSSP-2.0) has been created to improve the accuracy of our previous version of CSSP by deploying the existing algorithms. It uses input as probabilities and predicts the consensus for the secondary structure as a highly accurate three-state Q3 (helix, strand, and coil). This prediction is achieved using six recent top-performing methods: MUFOLD-SS, RaptorX, PSSpred v4, PSIPRED, JPred v4, and Porter 5.0. CSSP-2.0 validation includes datasets involving various protein classes from the PDB, CullPDB, and AlphaFold databases. Our results indicate a significant improvement in the accuracy of the consensus Q3 prediction. Using CSSP-2.0, crystallographers can sort out the stable regular secondary structures from the entire complex structure, which would aid in inferring the functional annotation of hypothetical proteins. 
The web server is freely available at <span><span>https://bioserver3.physics.iisc.ac.in/cgi-bin/cssp-2/</span><svg><path></path></svg></span></p></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141763295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Computational exploration of Ganoderma lucidum metabolites as potential anti-atherosclerotic agents: Insights from molecular docking and dynamics simulations
IF 2.6 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2024-07-23 DOI: 10.1016/j.compbiolchem.2024.108160

Ganoderma lucidum is a fungus used in Chinese medicine for various therapies because it exhibits a wide range of pharmacological activity. This study evaluates the possible drug-like qualities of the metabolites of G. lucidum and the impact of these metabolites on the pathways involved in atherosclerosis. A total of 17 compounds were chosen based on their drug-like properties and then used in the subsequent networking and docking simulations. According to the findings, the compound ganodone has a maximum binding energy of −7.243 kcal/mol, while the compound cianidanol has the lowest value. Based on the molecular docking investigations, TNF, AKT1, SRC, and STAT3 exhibited a higher affinity for the complex. To assess this, molecular dynamics simulation was performed for about 100 nanoseconds. GO functional analysis showed that the target genes were involved in protein binding, ATP binding, enzyme binding, and protein tyrosine kinase activity. Overall, the study results provide a view of possible metabolites that may have an impact on disease progression.

Citations: 0
Multitargeted molecular docking and dynamics simulation studies of 1,3,4-thiadiazoles synthesised from (R)-carvone against specific tumour protein markers: An In-silico study of two diastereoisomers
IF 2.6 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2024-07-23 DOI: 10.1016/j.compbiolchem.2024.108159

In the present work, we describe the synthesis of new 1,3,4-thiadiazole derivatives from natural (R)-carvone in three steps: dichloro-cyclopropanation, condensation with thiosemicarbazide, and a 1,3-dipolar cycloaddition reaction with various nitrilimines. The targeted compounds were structurally identified by ¹H and ¹³C NMR and HRMS analyses. The cytotoxic assay demonstrated that some of the novel synthesized compounds were potent against certain cancer cell lines. Molecular modeling studies were undertaken to rationalize the wet lab study results. Furthermore, molecular docking was performed to unveil the binding potential of the most active derivatives, 3a and 6c, to caspase-3 and COX-2. The stabilities of the protein-compound complexes obtained from the docking were evaluated using MD simulation. In addition, FMO and related parameters of the active compounds and their stereoisomers were examined through DFT studies. The docking study showed that compound 6c had a higher binding potential to caspase-3. However, the binding strength of 6c was found to be less than that of the standard drug, doxorubicin, as it formed fewer conventional hydrogen bonds. On the other hand, compound 3a had a higher binding potential to COX-2. However, the binding potential of 3a was much lower than that of the standard COX-2 inhibitor, celecoxib. The MD simulation demonstrated that the caspase-3-6c complex was less stable than the caspase-3-doxorubicin complex. In contrast, the COX-2-3a complex was stable, and 3a was anticipated to remain inside the protein's binding pocket. The DFT study showed that 3a had higher chemical stability than 6c. The electron exchange capacity, chemical stability, and molecular orbital distributions of the stereoisomers of the active compounds were also found to be alike.

Citations: 0
Investigating the anti-lung cancer properties of Zhuang medicine Cycas revoluta Thunb. leaves targeting ion channels and transporters through a comprehensive strategy
IF 2.6 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2024-07-19 DOI: 10.1016/j.compbiolchem.2024.108156

Background

Cycas revoluta Thunb., known for its ornamental, economic, and medicinal value, has leaves often discarded as waste. However, in ethnic regions of China, the leaves (CRL) are used in folk medicine for anti-tumor properties, particularly for regulating pathways related to cancer. Recent studies on ion channels and transporters (ICTs) highlight their therapeutic potential against cancer, making it vital to identify CRL’s active constituents targeting ICTs in lung cancer.

Purpose

This study aims to uncover bioactive substances in CRL and their mechanisms in regulating ICTs for lung cancer treatment using network pharmacology, bioinformatics, molecular docking, molecular dynamics (MD) simulations, in vitro cell assays and HPLC.

Methods

We analyzed 62 CRL compounds, predicted targets using PubChem and SwissTargetPrediction, identified lung cancer and ICT targets via GeneCards, and visualized overlaps with R software. Interaction networks were constructed using Cytoscape and STRING. Gene expression, GO, and KEGG analyses were performed using R software. TCGA data provided insights into differential, correlation, survival, and immune analyses. Key interactions were validated through molecular docking and MD simulations. Main biflavonoids were quantified using HPLC, and in vitro cell viability assays were conducted for key biflavonoids.
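
The target-overlap step can be illustrated with a small Python sketch. The study reports doing this with R software; the function `intersect_targets` below is a hypothetical helper, and the gene lists in the usage note are toy inputs chosen for illustration, not the study's target sets.

```python
from typing import Iterable, List, Set


def _normalize(symbols: Iterable[str]) -> Set[str]:
    """Upper-case and strip gene symbols so that lists from different sources compare cleanly."""
    return {symbol.strip().upper() for symbol in symbols if symbol.strip()}


def intersect_targets(
    compound_targets: Iterable[str],
    disease_targets: Iterable[str],
    ict_targets: Iterable[str],
) -> List[str]:
    """Return the sorted gene symbols present in all three target lists.

    This corresponds to the central region of a three-set Venn diagram.
    """
    shared = (
        _normalize(compound_targets)
        & _normalize(disease_targets)
        & _normalize(ict_targets)
    )
    return sorted(shared)
```

Called with toy lists such as `["EGFR", "GABRG2", "SRC"]`, `["egfr", "GABRG2 ", "TP53"]`, and `["GABRG2", "EGFR", "KCNH2"]`, the function keeps only the symbols common to all three. Normalizing case and whitespace first avoids losing matches when the source databases format symbols differently.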

Results

Venn diagram analysis identified 52 intersecting targets and ten active CRL compounds. The PPI network highlighted seven key targets. GO and KEGG analysis showed CRL-targeted ICTs involved in synaptic transmission, GABAergic synapse, and proteoglycans in cancer. Differential expression and correlation analysis revealed significant differences in five core targets in lung cancer tissues. Survival analysis linked EGFR and GABRG2 with overall survival, and immune infiltration analysis associated the core targets with most immune cell types. Molecular docking indicated strong binding of CRL ingredients to core targets. HPLC revealed amentoflavone as the most abundant biflavonoid, followed by hinokiflavone, sciadopitysin, and podocarpusflavone A. MD simulations showed that podocarpusflavone A and amentoflavone had better binding stability with GABRG2, and the cell viability assay also proved that they had better anti-lung cancer potential.

Conclusions

This study identified potential active components, targets, and pathways of CRL-targeted ICTs for lung cancer treatment, suggesting CRL’s utility in drug development and its potential beyond industrial waste.

Citations: 0