
Computational Biology and Chemistry: Latest Articles

Discovering novel targets of abscisic acid using computational approaches
IF 2.6 Zone 4, Biology Q2 BIOLOGY Pub Date: 2024-07-19 DOI: 10.1016/j.compbiolchem.2024.108157

Abscisic acid (ABA) is a crucial plant hormone that is also produced naturally in various mammalian tissues and holds significant potential as a therapeutic molecule in humans. ABA was selected for this study because of its known roles in essential human metabolic processes, such as glucose homeostasis, immune responses, cardiovascular function, and inflammation regulation. Despite this importance, the molecular mechanisms underlying ABA's action remain largely unexplored. This study employed computational techniques to identify potential human ABA receptors. We screened 64 candidate molecules using online servers and performed molecular docking to assess binding affinity and interaction types with ABA. The stability and dynamics of the best complexes were investigated using 100 ns molecular dynamics simulations. Root mean square fluctuation (RMSF), root mean square deviation (RMSD), solvent-accessible surface area (SASA), radius of gyration (Rg), free energy landscape (FEL), and principal component analysis (PCA) were analyzed. Next, the molecular mechanics Poisson–Boltzmann surface area (MM-PBSA) method was used to calculate the binding energies of the complexes from the simulation trajectories. Our study identified four candidate receptors for ABA signaling (androgen receptor, glucocorticoid receptor, mineralocorticoid receptor, and retinoic acid receptor beta) that showed strong binding affinity for ABA and remained structurally stable throughout the simulations. Simulations with hydralazine as an unrelated ligand were conducted to validate the specificity of the identified receptors for ABA. These findings can guide further experimental validation and contribute to a better understanding of how ABA functions in humans.
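Two of the stability metrics named in this abstract, RMSD and Rg, reduce to short formulas over atomic coordinates. The following is a minimal sketch in plain Python, not the authors' pipeline: the function names and the three-atom toy coordinates are hypothetical, and the RMSD assumes the two conformations are already superimposed (no fitting step is performed).

```python
import math


def rmsd(frame, reference):
    """Root mean square deviation between two conformations.

    Both inputs are lists of (x, y, z) tuples in the same atom order and
    are assumed to be superimposed already; no alignment is done here.
    """
    if len(frame) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(frame, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(frame))


def radius_of_gyration(frame, masses=None):
    """Mass-weighted radius of gyration; equal masses if none are given."""
    if masses is None:
        masses = [1.0] * len(frame)
    total_mass = sum(masses)
    # Centre of mass along each axis.
    center = [
        sum(m * p[i] for m, p in zip(masses, frame)) / total_mass
        for i in range(3)
    ]
    # Mass-weighted squared distance of every atom from the centre of mass.
    weighted = sum(
        m * sum((p[i] - center[i]) ** 2 for i in range(3))
        for m, p in zip(masses, frame)
    )
    return math.sqrt(weighted / total_mass)


# Hypothetical toy coordinates: three atoms on a line, then shifted by 1 along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

print(rmsd(frame, reference))
print(radius_of_gyration(reference))
```

In a real analysis these quantities would be computed per frame over the whole trajectory, after least-squares fitting to a reference structure, typically with a dedicated MD analysis tool.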

Citations: 0
Multi-scale DNA language model improves 6mA binding sites prediction
IF 2.6 Zone 4, Biology Q2 BIOLOGY Pub Date: 2024-07-18 DOI: 10.1016/j.compbiolchem.2024.108129

DNA methylation at the N6 position of adenine (N6-methyladenine, 6mA), the attachment of a methyl group to the N6 site of adenine (A) in DNA, is an important epigenetic modification in prokaryotic and eukaryotic genomes. Accurately predicting 6mA binding sites can provide crucial insights into gene regulation, DNA repair, and disease development. Wet-lab experiments are commonly used to analyze 6mA binding sites, but they are costly and time-consuming. Therefore, various deep learning methods have recently been widely used to predict 6mA binding sites. In this study, we develop a framework based on a multi-scale DNA language model, named "iDNA6mA-MDL". "iDNA6mA-MDL" integrates multiple k-mers and the nucleotide property and frequency method for feature embedding, which can capture a full range of DNA sequence context information. At the prediction stage, it also leverages DNABERT to compensate for the incomplete capture of global DNA information. Experiments show that our framework obtains an average AUC of 0.981 on a classic 6mA rice gene dataset, surpassing all existing advanced models under fivefold cross-validation. Moreover, "iDNA6mA-MDL" outperforms most popular state-of-the-art methods on 11 additional 6mA datasets, demonstrating its effectiveness in 6mA binding site prediction.
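The "multiple k-mers" idea can be illustrated by tokenizing one sequence at several window sizes. This is a minimal sketch, not the iDNA6mA-MDL implementation: the function names, the example sequence, and the chosen values of k are all hypothetical, and overlapping windows with stride 1 are an assumption.

```python
def kmers(sequence, k):
    """Return the overlapping k-mers of a DNA sequence (stride 1)."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def multi_scale_tokens(sequence, ks=(3, 4, 5, 6)):
    """Tokenize the same sequence at several k values (the 'scales')."""
    return {k: kmers(sequence, k) for k in ks}


# Hypothetical 7-nucleotide example sequence.
tokens = multi_scale_tokens("GATCAGT", ks=(3, 5))
for k, words in tokens.items():
    print(k, words)
```

Each scale yields a different token stream for the same site, so a model that embeds several scales sees both short local motifs and longer context.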

Citations: 0
Identification of imidazole-based small molecules to combat cognitive disability caused by Alzheimer’s disease: A molecular docking and MD simulations based approach
IF 2.6 Zone 4, Biology Q2 BIOLOGY Pub Date: 2024-07-18 DOI: 10.1016/j.compbiolchem.2024.108152

Alzheimer's disease (AD) is a chronic neurodegenerative disorder and the primary cause of dementia. It is characterised by the gradual loss of brain cells, which results in memory loss and cognitive dysfunction. One of the hallmarks of AD is abnormal upregulation of the enzyme glutaminyl-peptide cyclotransferase (QPCT or QC). Beyond AD, QC has also been implicated in pathological conditions such as Huntington's disease (HD), melanomas, carcinomas, atherosclerosis, and septic arthritis. Inhibition of QC has therefore emerged as a potential strategy for preventing multiple pathological conditions. Considering this, we screened a library of 153,536 imidazole-based compounds against a double-mutant (Y115E-Y117E) QC target. Molecular docking based virtual screening and absorption, distribution, metabolism, excretion/toxicity (ADME/T) predictions identified five compounds, namely 118981836, 136459842, 139388116, 139388226, and 139958725. Furthermore, molecular dynamics (MD) simulations of 500 ns were conducted to investigate the behaviour of the identified compounds with the target receptor. The results were compared with those for the co-ligand by analysing RMSD, RMSF, and SASA parameters. To our knowledge, this is the first computational study to employ a double-mutant protein to identify new imidazole-based QC inhibitors.
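RMSF, one of the parameters compared here, measures how far each atom fluctuates around its own time-averaged position. The following is a minimal sketch in plain Python under simplifying assumptions: the function name and the two-frame, two-atom toy trajectory are hypothetical, and the frames are assumed to be aligned to a common reference beforehand.

```python
import math


def rmsf(trajectory):
    """Per-atom root mean square fluctuation.

    `trajectory` is a list of frames; each frame is a list of (x, y, z)
    tuples in a fixed atom order. Frames are assumed to be pre-aligned.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [
            sum(frame[atom][i] for frame in trajectory) / n_frames
            for i in range(3)
        ]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[atom][i] - mean[i]) ** 2 for i in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Hypothetical toy trajectory: atom 0 is static, atom 1 moves along y.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]
print(rmsf(trajectory))
```

High per-residue RMSF values flag flexible regions such as loops, while low values around a binding site are usually read as a sign that the ligand stays stably bound.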

Citations: 0
Exploring potential molecular targets and therapeutic efficacy of beauvericin in triple-negative breast cancer cells
IF 2.6 Zone 4, Biology Q2 BIOLOGY Pub Date: 2024-07-18 DOI: 10.1016/j.compbiolchem.2024.108154

Triple-negative breast cancer (TNBC) presents a significant global health concern due to its aggressive nature, high mortality rate, and limited treatment options, highlighting the urgent need for targeted therapies. Beauvericin, a bioactive fungal secondary metabolite, possesses significant anticancer potential, although its molecular targets in cancer cells remain unexplored. This study investigated possible molecular targets of beauvericin and its therapeutic effects in TNBC cells. In silico studies using molecular docking and MD simulation predicted the molecular targets of beauvericin. The identified targets included MRP-1 (ABCC1), HDAC-1, HDAC-2, LCK, and SYK, with average binding energies of −90.1, −44.3, −72.1, −105, and −60.8 kJ/mol, respectively, implying multifaceted roles in reversing drug resistance and in inhibiting epigenetic modulators and oncogenic tyrosine kinases. Beauvericin significantly reduced the viability of MDA-MB-231 and MDA-MB-468 cells, with IC50 values of 4.4 and 3.9 µM, while elevating intracellular ROS 9.0- and 7.9-fold, respectively. A subsequent reduction of mitochondrial transmembrane potential in TNBC cells confirmed the induction of oxidative stress, leading to apoptotic cell death, as observed by flow cytometric analyses. Beauvericin also arrested the cell cycle at the G1 phase and impaired the spheroid formation and clonal expansion abilities of TNBC cells. Spheroid viability was reduced upon beauvericin treatment, with IC50 values of 10.3 and 6.2 µM in MDA-MB-468 and MDA-MB-231 cells, respectively. In conclusion, beauvericin demonstrated promising therapeutic potential against TNBC cells through possible inhibition of MRP-1 (ABCC1), HDAC-1, HDAC-2, LCK, and SYK.

Citations: 0
Unraveling the intricate physiological processes dysregulated in CHD-affected and Dan-Lou tablet-treated individuals
IF 2.6 Zone 4, Biology Q2 BIOLOGY Pub Date: 2024-07-17 DOI: 10.1016/j.compbiolchem.2024.108151

Coronary heart disease (CHD), a multifactorial cardiovascular condition, arises from the accumulation of atherosclerotic plaque in the coronary arteries, resulting in compromised blood flow to the heart and complications such as angina, myocardial infarction, or heart failure. Addressing global prevalence, risk factors, and genetics is crucial for effective management. The current study aims to identify molecular biomarkers for CHD by examining the expression patterns of differentially expressed genes (DEGs) with various bioinformatic tools. In this investigation, a total of 24 samples were analyzed using the GEO2R tool. These included eight samples from individuals before treatment (GSM5434123–30), eight samples from patients after Dan-Lou tablet treatment (GSM5434131–38), and eight samples from healthy control subjects (GSM5434139–46). A suite of bioinformatics tools, namely Cytoscape (v3.10.1) and Molecular Complex Detection (MCODE), was used to detect enriched genes within the network. Functional analysis of the DEGs was conducted with clusterProfiler, an R-based package, and ClueGO. In total, 182 and 174 DEGs, corresponding to the untreated and treated patient sample groups, were functionally annotated with gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) terms. ARF6 gene dysregulation was implicated in the myeloid cell apoptotic process (GO:0033028), regulation of actin cytoskeleton (hsa:04810), and other vital cellular functions. The myeloid cell apoptotic process (GO:0033028) was also observed to be regulated by the differential expression of the STAT5B gene. Additionally, STAT5B was found to be associated with the regulation of erythrocyte differentiation (GO:0045646). Targeted therapy based on a patient's individual gene expression profile could help treat various disorders in the near future.

Citations: 0
The antiviral drug Ribavirin effectively modulates the amyloid transformation of α-Synuclein protein
IF 2.6 Zone 4, Biology Q2 BIOLOGY Pub Date: 2024-07-16 DOI: 10.1016/j.compbiolchem.2024.108155

α-Synuclein (α-syn) is an intrinsically disordered protein linked genetically and neuropathologically to Parkinson's disease, in which it aggregates within the brain. Hence, identifying compounds capable of impeding α-syn aggregation is a promising approach for the development of disease-modifying therapies. Herein, we investigated the efficacy of Ribavirin, an FDA-approved compound, in curtailing α-syn amyloid transformation, employing an array of bioinformatic tools and systematic analysis using biophysical techniques. Ribavirin shows a dose-dependent anti-aggregation propensity: it effectively suppressed the formation of mature fibrillar aggregates of α-syn, and even at the lowest concentration there was a 69% reduction in the ThT maxima. Ribavirin averts the formation of mature fibrillar aggregates by interacting with the NAC domain of α-syn. It redirects the amyloid transformation of α-syn by producing lower-order aggregates with a reduced cross-β-sheet signature and prevents the formation of on-pathway amyloids. Collectively, our study puts forward Ribavirin as a promising molecule for therapeutic intervention in Parkinson’s disease.

Citations: 0
SwissADME studies and Density Functional Theory (DFT) approaches of methyl substituted curcumin derivatives
IF 2.6 Zone 4, Biology Q2 BIOLOGY Pub Date: 2024-07-15 DOI: 10.1016/j.compbiolchem.2024.108153

Research suggests that curcumin is safe and effective, prompting interest in its use for treating and preventing various human diseases. The current study aimed to predict the druggability of methyl-substituted curcumin derivatives (BL1 to BL4) using SwissADME and Density Functional Theory (DFT) approaches. The curcumin derivatives investigated mostly adhere to Lipinski's rule of five, with molecular properties including molecular weight (MW), fraction of sp3 carbons (Fraction Csp3), number of hydrogen-bond acceptors (nHBA), number of hydrogen-bond donors (nHBD), and topological polar surface area (TPSA) falling within acceptable limits. The compounds showed high lipophilicity but poor water solubility. The pharmacokinetic evaluation predicted favorable gastrointestinal absorption and blood-brain barrier permeation. None of the derivatives was identified as a P-glycoprotein substrate, but they showed inhibitory activity against various cytochrome P450 enzymes. Additionally, all derivatives had the same bioavailability score of 0.55. DFT computations on the curcumin derivatives were performed at the B3LYP/6-311G** level to predict and assess the key electronic characteristics underlying their bioactivity. Accordingly, the BL4 molecule (ΔEgap = 4.105 eV) would prefer to interact with the external molecular system more than the other molecules due to having the biggest energy gap. The ΔNmax (2.328 eV) and Δεback-donation (−0.446 eV) scores implied that BL1 would have more charge transfer capability and the lowest stability via back donation among the compounds. In short, the derivatives (BL1 to BL4) exhibited strong extrinsic therapeutic properties and are therefore eligible for further in vitro and in vivo studies.
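The quantities quoted in this abstract (ΔEgap, ΔNmax, Δεback-donation) are commonly derived from frontier orbital energies within conceptual DFT. The sketch below uses the conventional Koopmans-type definitions; whether the authors used exactly these formulas is an assumption, and the function name and the orbital energies in the example are hypothetical round numbers, not values from the paper.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Conceptual-DFT descriptors from frontier orbital energies (in eV).

    Assumes the Koopmans approximation: I = -E_HOMO and A = -E_LUMO.
    """
    ionization = -e_homo                  # ionization energy I
    affinity = -e_lumo                    # electron affinity A
    chi = (ionization + affinity) / 2.0   # electronegativity
    eta = (ionization - affinity) / 2.0   # chemical hardness
    return {
        "gap": e_lumo - e_homo,              # HOMO-LUMO energy gap
        "chi": chi,
        "eta": eta,
        "omega": chi ** 2 / (2.0 * eta),     # electrophilicity index
        "delta_n_max": chi / eta,            # maximum charge transfer
        "back_donation": -eta / 4.0,         # back-donation energy
    }


# Hypothetical orbital energies chosen only to give round numbers.
d = reactivity_descriptors(e_homo=-6.0, e_lumo=-2.0)
for name, value in d.items():
    print(name, value)
```

Under these definitions the back-donation energy depends only on the hardness, so a smaller gap gives a less negative value and a larger ΔNmax for the same electronegativity.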

Citations: 0
Small-group originating model: Optimized individual-level GWAS simulation featured by SLiM and using open-access data
IF 2.6 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2024-07-15 DOI: 10.1016/j.compbiolchem.2024.108147

The development of analytical methods for Genome-wide Association Studies (GWAS) has outpaced the evolution of simulation techniques and pipelines. This disparity underscores the importance of innovative simulation methods that can keep pace with the rapidly increasing scale of GWAS. The median sample size of GWAS over the past ten years has exceeded 50,000 individuals, a trend that emphasizes the need for simulation tools capable of generating data on a similar or larger scale. This paper introduces a novel method, the small-group originating (SGO) model, utilizing the SLiM software for simulating individual-level GWAS data. Our standardized protocol facilitates the generation of tens of thousands of pseudo-individuals with millions of variants from small (30−90) open-access datasets.
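To make the idea of growing a large cohort from a small founder group concrete, here is a toy forward simulation in plain Python. It is not the authors' SGO protocol and does not use SLiM: it assumes random mating, a single crossover per gamete, geometric population growth, and no mutation or selection. Every function name and parameter is hypothetical.

```python
import random


def recombine(hap_a, hap_b, rng):
    """Build one gamete from a parent's two haplotypes with a single crossover."""
    cut = rng.randrange(len(hap_a) + 1)
    if rng.random() < 0.5:
        hap_a, hap_b = hap_b, hap_a  # pick which haplotype starts the gamete
    return hap_a[:cut] + hap_b[cut:]


def expand(founders, target_size, generations, seed=0):
    """Grow a small founder group into target_size diploid pseudo-individuals.

    founders: list of (haplotype, haplotype) pairs; each haplotype is a list
    of 0/1 alleles. The population grows geometrically towards target_size.
    """
    rng = random.Random(seed)
    population = list(founders)
    growth = (target_size / len(founders)) ** (1.0 / generations)
    for gen in range(1, generations + 1):
        if gen == generations:
            size = target_size
        else:
            size = max(2, round(len(founders) * growth ** gen))
        children = []
        for _ in range(size):
            mother, father = rng.sample(population, 2)  # two distinct parents
            children.append((recombine(*mother, rng), recombine(*father, rng)))
        population = children
    return population


# Hypothetical founder panel: 30 individuals, 50 biallelic variants each.
init = random.Random(42)
founders = [
    ([init.randint(0, 1) for _ in range(50)], [init.randint(0, 1) for _ in range(50)])
    for _ in range(30)
]
cohort = expand(founders, target_size=2000, generations=5)
print(len(cohort), "pseudo-individuals with", len(cohort[0][0]), "variants each")
```

Because the toy model has no mutation, any variant that is monomorphic among the founders stays monomorphic in the simulated cohort, which is one reason a realistic pipeline needs a richer model than this.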

SGO stands out when compared with the widely used resampling method in HapGen, showing superior simulation efficiency for large samples (> 13,000) of unrelated individuals. This capability is particularly relevant given the current trajectory towards larger GWAS, which calls for tools that can simulate datasets reflecting this growth. In addition, SGO provides customization options and can model dynamic life cycles and mating across generations, making it a highly promising alternative for GWAS simulations.

In a case study, sensitivity analyses of chromosome-level principal component analysis and kinship coefficient estimation were conducted. The results highlighted the poor robustness of chromosome-level quality control (QC) indexes and the uneven distribution of population structure across chromosomes and ancestries, and they argue for caution against relying solely on chromosome-level QC statistics.

With its flexible and efficient approach to generating pseudo GWAS data, our standardized SGO protocol is a valuable asset for method development, power analysis, and benchmarking in GWAS research. It is especially useful for meeting the demand for large-scale simulations that match the current and future scale of GWAS.

Citations: 0
Quantum-level machine learning calculations of Levodopa
IF 2.6 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2024-07-14 DOI: 10.1016/j.compbiolchem.2024.108146

Many drug molecules contain functional groups whose rotation around the bond linking two fragments gives rise to a torsional barrier. In medicinal chemistry and the pharmaceutical sciences, including drug design studies, accurate calculation of the potential energy surface (PES) of these molecular torsions is extremely important. Machine learning (ML), including deep learning (DL), is currently one of the most rapidly evolving tools in computer-aided drug discovery and molecular simulations. In this work, we used the ANI-1x neural network potential as a quantum-level ML method to predict the PESs of the antiparkinsonian drug molecule L-3,4-dihydroxyphenylalanine (Levodopa). The electronic energies and structural parameters calculated by density functional theory (DFT) using the wB97X functional and all possible Pople basis sets indicated that the 6-31G(d) basis set, when used with wB97X, behaves similarly to the ANI-1x model. The investigation of vibrational frequencies showed a linear correlation between the DFT and ML data. All ANI-1x calculations completed in a very short computing time. From this perspective, we expect the ANI-1x dataset applied in this work to be appreciably efficient and effective in computational structure-based drug design studies.
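The linear relationship reported between DFT and ML vibrational frequencies can be quantified with an ordinary least-squares fit and a Pearson coefficient. The following sketch shows one way to do that in plain Python. The frequency values are invented placeholders, not data from the paper, and `linear_fit` is a hypothetical helper name.

```python
from math import sqrt


def linear_fit(x, y):
    """Return (slope, intercept, pearson_r) for paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both sequences must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)


# Placeholder frequencies in cm^-1 (hypothetical, for illustration only).
dft_freqs = [350.0, 820.0, 1250.0, 1640.0, 3050.0]
ml_freqs = [362.0, 811.0, 1263.0, 1629.0, 3071.0]

slope, intercept, r = linear_fit(dft_freqs, ml_freqs)
print(f"ML = {slope:.4f} * DFT + {intercept:.2f}, r = {r:.5f}")
```

A slope close to 1 with a small intercept would indicate that the ML potential reproduces the DFT frequencies without a systematic scaling error; a high r alone only shows that the two sets move together.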

Citations: 0
Subclassification of lung adenocarcinoma through comprehensive multi-omics data to benefit survival outcomes
IF 2.6 Zone 4 (Biology) Q2 BIOLOGY Pub Date: 2024-07-14 DOI: 10.1016/j.compbiolchem.2024.108150

Objectives

Lung adenocarcinoma (LUAD) is the most common subtype of non-small cell lung cancer. Understanding the molecular mechanisms underlying tumor progression is of great clinical significance. This study aims to identify novel molecular markers associated with LUAD subtypes in order to improve the precision of LUAD subtype classification. It also seeks to optimize the classification from the perspective of patient survival analysis.

Materials and methods

We propose an innovative, comprehensive, and robust feature-selection approach for LUAD classification. The method integrates multi-omics data from The Cancer Genome Atlas (TCGA) and combines three algorithms: max-relevance and min-redundancy (mRMR), the least absolute shrinkage and selection operator (LASSO), and Boruta. The selected features were then used in six machine-learning classifiers: logistic regression, random forest, support vector machine, naive Bayes, k-nearest neighbor, and XGBoost.
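As a rough illustration of the max-relevance, min-redundancy idea named above, the sketch below greedily picks features that correlate with the labels while penalizing overlap with features already chosen. It is not the authors' pipeline: it substitutes absolute Pearson correlation for the mutual information used in the original mRMR formulation, it omits the LASSO and Boruta stages entirely, and the feature names and values are hypothetical toy data.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_u * var_v)


def mrmr(features, target, k):
    """Greedy selection maximizing relevance minus mean redundancy.

    features: dict mapping feature name -> list of values (one per sample).
    target: list of numeric class labels, one per sample.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        best_name, best_score = None, float("-inf")
        for name, col in features.items():
            if name in selected:
                continue
            if selected:
                redundancy = sum(
                    abs(pearson(col, features[s])) for s in selected
                ) / len(selected)
            else:
                redundancy = 0.0
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Toy data (hypothetical): four samples, two classes.
labels = [0, 0, 1, 1]
toy = {
    "feat_a": [1.0, 3.0, 7.0, 9.0],        # strongly tied to the labels
    "feat_a_like": [0.5, 3.5, 6.5, 9.5],   # near-duplicate of feat_a
    "feat_b": [0.0, 6.0, 10.0, 4.0],       # weaker but complementary signal
    "feat_noise": [4.0, 6.0, 6.0, 4.0],    # unrelated to the labels
}
print(mrmr(toy, labels, 2))
```

In this toy set the near-duplicate feature is passed over on the second pick even though it is more relevant than `feat_b`, because almost all of its signal is already carried by `feat_a`.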

Results

The proposed approach achieved an area under the receiver operating characteristic curve (AUC) of 0.9958 for logistic regression (LR). Notably, a composite model incorporating copy number, methylation, and RNA-sequencing data for the expression of exons, genes, and miRNA mature strands surpassed models built on single-omics data or other multi-omics combinations in both accuracy and AUC. Survival analyses showed that the support vector machine (SVM) classifier produced the best classification, outperforming that achieved by TCGA. To enhance model interpretability, SHapley Additive exPlanations (SHAP) values were used to elucidate the impact of each feature on the predictions. Gene Ontology (GO) enrichment analysis identified significant biological processes, molecular functions, and cellular components associated with LUAD subtypes.
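For readers less familiar with the AUC metric quoted above, it can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it that way for a binary problem in plain Python. How the authors handled multiple LUAD subtypes (for example one-vs-rest averaging) is not stated in the abstract, and the labels and scores shown are made up.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair. Labels must be 0 or 1.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up classifier scores for four samples.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_score))  # 3 of the 4 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.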

Conclusion

In summary, our feature selection process, based on TCGA multi-omics data and combined with multiple machine learning classifiers, proficiently identifies molecular subtypes of lung adenocarcinoma and their corresponding significant genes. Our method could enhance the early detection and diagnosis of LUAD, expedite the development of targeted therapies and, ultimately, lengthen patient survival.

Citations: 0