Computational Biology and Chemistry: Latest Articles

Integrating (deep) machine learning and cheminformatics for predicting human intestinal absorption of small molecules
IF 2.6 · Zone 4 (Biology) · Q2 BIOLOGY · Pub Date: 2024-10-28 · DOI: 10.1016/j.compbiolchem.2024.108270
Orchid Baruah, Upashya Parasar, Anirban Borphukan, Bikram Phukan, Pankaj Bharali, Selvaraman Nagamani, Hridoy Jyoti Mahanta
The oral route is the preferred route for drug delivery, and oral drugs accordingly account for the largest share of the pharmaceutical market. Human intestinal absorption (HIA) is closely related to oral bioavailability, making it an important factor in predicting drug absorption. In this study, we focus on predicting drug permeability at HIA as a marker for oral bioavailability. A set of 2648 compounds was collected from early as well as recent works and curated to build a robust dataset. Five machine learning (ML) algorithms were trained on a set of molecular descriptors of these compounds, selected after rigorous feature engineering. Additionally, two deep learning models, one based on a graph convolution neural network (GCNN) and one on a graph attention network (GAT), were developed using the same set of compounds to explore predictability with automatically extracted features. The numerical analyses show that, out of the five ML models, Random Forest and LightGBM predicted with accuracies of 87.71 % and 86.04 % on the test set and 81.43 % and 77.30 % on the external validation set, respectively. The GCNN- and GAT-based models achieved final accuracies of 77.69 % and 78.58 % on the test set and 79.29 % and 79.42 % on the external validation set, respectively. We believe that deploying these models for screening oral drugs can provide promising results and have therefore deposited the dataset and models on GitHub (https://github.com/hridoy69/HIA).
Computational Biology and Chemistry, vol. 113, Article 108270.
Citations: 0
AI screening and molecular dynamic simulation-driven identification of novel inhibitors of TGFβR1 for pancreatic cancer therapy
IF 2.6 · Zone 4 (Biology) · Q2 BIOLOGY · Pub Date: 2024-10-28 · DOI: 10.1016/j.compbiolchem.2024.108262
Samvedna Singh, Kiran Bharat Lokhande, Aman Chandra Kaushik, Ashutosh Singh, Shakti Sahi
Pancreatic cancer, with a 5-year survival rate below 10 %, is one of the deadliest malignancies. The TGF-β pathway plays a crucial role in this disease, making it a key target for therapeutic intervention. Clinical trials targeting TGF-β have faced challenges of toxicity and limited efficacy, highlighting the need for more potent small molecule inhibitors. We selected TGFβR1 as the drug target to inhibit TGF-β signaling in pancreatic cancer. A multi-faceted approach was employed, commencing with AI-driven screening techniques to rapidly identify potential TGFβR1 inhibitors from vast compound libraries, including the ZINC and ChEMBL databases. AI-screened compounds were further validated through structure-based high-throughput virtual screening (HTVS) to evaluate their binding affinity to TGFβR1. In addition, dedicated libraries of anticancer compounds (65,000 compounds) and protein kinase inhibitors (36,324 compounds) were also used for HTVS. Subsequently, pharmacokinetic profiling narrowed the selection to 40 hit compounds. Five hit compounds were chosen for molecular dynamics (MD) simulations based on binding affinity, non-bonded interactions, stereochemistry, and pharmacokinetic profiles. Trajectory analysis showed that residues HIS283, ASP351, LYS232, SER280, ILE211, and LYS213 within the active site of TGFβR1 are crucial for ligand binding through hydrogen bonds and hydrophobic interactions. Principal component analysis (PCA) and dynamic cross-correlation matrix (DCCM) analysis were used to evaluate the receptor's dynamic response to the hit compounds. The simulation data revealed that compounds 1, 2, 3, 4, and 5 formed stable complexes with TGFβR1. Notably, post-MDS MM-GBSA analysis showed that compounds 4 and 5 exhibited exceptionally strong binding energies of −81.0 kcal/mol and −85.5 kcal/mol, respectively. The comprehensive computational analysis confirms compounds 4 and 5 as promising TGFβR1 hits with potential therapeutic applications in the development of new treatments for pancreatic cancer.
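As a reminder of what the MM-GBSA binding energies quoted above measure, the generic end-state expression is given below. This is the standard textbook form, not necessarily the exact protocol the authors used; the symbol names are chosen here for illustration.

```latex
\Delta G_{\text{bind}} = G_{\text{complex}} - \left( G_{\text{receptor}} + G_{\text{ligand}} \right),
\qquad
G = E_{\text{MM}} + G_{\text{GB}} + G_{\text{SA}} - T S
```

Here $E_{\text{MM}}$ is the molecular-mechanics energy, $G_{\text{GB}}$ the polar (generalized Born) solvation term, and $G_{\text{SA}}$ the nonpolar surface-area term. The entropy contribution $-TS$ is often omitted in practice, so reported values are best read as relative rankings, with more negative values indicating more favorable predicted binding.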
Computational Biology and Chemistry, vol. 113, Article 108262.
Citations: 0
Structure-based screening of FDA-approved drugs and molecular dynamics simulation to identify potential leukocyte antigen related protein (PTP-LAR) inhibitors
IF 2.6 · Zone 4 (Biology) · Q2 BIOLOGY · Pub Date: 2024-10-28 · DOI: 10.1016/j.compbiolchem.2024.108264
Shan Du, Xin-Xin Zhang, Xiang Gao, Yan-Bin He
Leukocyte antigen related protein (LAR), a member of the PTP family, has become a potential target for exploring therapeutic interventions for various complex diseases, including neurodegenerative diseases. Repurposing FDA-approved drugs offers a promising approach for rapidly identifying potential LAR inhibitors. In this study, we conducted a structure-based virtual screening of FDA-approved drugs from the ZINC database and selected candidate compounds based on their binding affinity and interactions with LAR. Our research revealed that the candidate compound ZINC6716957 exhibited excellent binding affinity to the binding pocket of LAR, formed interactions with key residues at the active site, and demonstrated low toxicity. To further understand the binding dynamics and interaction mechanisms, 100-ns molecular dynamics simulations were performed. Post-dynamics analyses (RMSD, RMSF, SASA, hydrogen bonds, binding free energy and free energy landscape) indicated that ZINC6716957 stabilized the structure of LAR and that the residues Tyr1355, Arg1431, Lys1433, Arg1528, Tyr1563 and Thr1567 played a vital role in stabilizing the conformational changes of the protein. In conclusion, the identified compound ZINC6716957 showed robust inhibitory activity on LAR and merits extensive research, with potentially significant therapeutic value in the treatment of complex diseases, particularly neurodegenerative disorders.
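Of the post-dynamics analyses listed above, RMSD is the simplest to illustrate. The following is a minimal sketch, assuming the two frames are already superimposed and given as plain lists of (x, y, z) coordinates; the function name and the toy coordinates are hypothetical and are not taken from the study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two frames of (x, y, z) points.

    Assumes both frames list the same atoms in the same order and are
    already superimposed; no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("frames must be non-empty and the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example (hypothetical coordinates): every atom shifted by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, frame))  # prints 1.0
```

In a real trajectory analysis each frame would first be fitted to the reference structure, and the RMSD would be tracked over simulation time to judge whether the complex has stabilized.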
Computational Biology and Chemistry, vol. 113, Article 108264.
Citations: 0
Investigating pH-induced conformational switch in PIM-1: An integrated multi spectroscopic and MD simulation study
IF 2.6 · Zone 4 (Biology) · Q2 BIOLOGY · Pub Date: 2024-10-28 · DOI: 10.1016/j.compbiolchem.2024.108265
Aanchal Rathi, Saba Noor, Shama Khan, Faizya Khan, Farah Anjum, Anam Ashraf, Aaliya Taiyab, Asimul Islam, Md. Imtaiyaz Hassan, Mohammad Mahfuzul Haque
PIM-1 is a Ser/Thr kinase that has been extensively studied as a potential target for cancer therapy due to its significant roles in various cancers, including prostate and breast cancers. Given its importance in cancer, researchers are investigating the structure of PIM-1 for pharmacological inhibition to discover therapeutic interventions. This study examines structural and conformational changes in PIM-1 across different pH values using various spectroscopic and computational techniques. Spectroscopic results indicate that PIM-1 maintains its secondary and tertiary structure within the pH range of 7.0–9.0. However, protein aggregation occurs in the acidic pH range of 5.0–6.0. Additionally, kinase assays suggested that PIM-1 activity is optimal within the pH range of 7.0–9.0. Subsequently, we performed a 100 ns all-atom molecular dynamics (MD) simulation to examine the effect of pH on PIM-1 structural stability at the molecular level. MD simulation analysis revealed that PIM-1 retains its native conformation in alkaline conditions, with some residual fluctuations in acidic conditions as well. A strong correlation was observed between our MD simulation, spectroscopic, and enzymatic activity studies. Understanding the pH-dependent structural changes of PIM-1 can provide insights into its role in disease conditions and cellular homeostasis, particularly regarding protein function under varying pH conditions.
Computational Biology and Chemistry, vol. 113, Article 108265.
Citations: 0
Implications of trinodal inhibitions and drug repurposing in MAPK pathway: A putative remedy for breast cancer
IF 2.6 · Zone 4 (Biology) · Q2 BIOLOGY · Pub Date: 2024-10-24 · DOI: 10.1016/j.compbiolchem.2024.108255
Shalini Majumder, Ekarsi Lodh, Tapan Chowdhury
Breast cancer is one of the leading causes of cancer-related deaths among women worldwide. To compound the problem, cancer cells often develop resistance, whether innate or acquired, against the available chemotherapy or monotargeted treatments. This resistance is concomitant with increased activation of the MAPK (mitogen-activated protein kinase) signaling pathway. This study simultaneously targets three essential intermediates in this pathway using molecular docking and real-time simulation. Docking was performed with AutoDock Vina 1.1.2 and 1.2.5 as integrated in the PyRx software, while Discovery Studio (BIOVIA) v24.1.0.23298 was used to conduct the simulation. The aim is to investigate the therapeutic prospects of known potential inhibitors of the targeted intermediates and of repurposable drugs, in order to assess the effectiveness of targeting these three nodes simultaneously. The target points were PDPK1 (3-phosphoinositide-dependent protein kinase 1), ERK1/2 (extracellular signal-regulated kinases 1/2), and mTOR (mammalian target of rapamycin). Our study reveals that, of the candidate inhibitors chosen for each node, MP7 exhibited the best binding affinities for all three: −10.918 kcal/mol for PDPK1, −10.224 kcal/mol for ERK1, −10.134 kcal/mol for ERK2, and −9.2 kcal/mol for mTOR (via AutoDock Vina 1.2.5). The scores for MP7 were often better than those of the available single-targeted drugs for different nodes in the MAPK pathway. Additionally, among a total of 1867 repurposed analgesic, antibiotic, and antiparasitic drugs, several, including Zavegepant (−13.399 kcal/mol for PDPK1), Adozelesin (−11.74 kcal/mol for mTOR) and Modoflaner (−11.29 kcal/mol for PDPK1), showed more promising binding energetics against our three target points than the other compounds used. This approach points toward mitigating not only breast cancer but other elusive diseases as well, with state-of-the-art multitargeted therapies coupled with bioinformatic strategies.
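To make the comparison of docking scores concrete, the sketch below pools only the binding affinities quoted in the abstract above and picks the most negative (strongest predicted) score per target. It is an illustrative ranking, not the authors' workflow; the helper name `best_per_target` is hypothetical, and the full screen covered many more compounds than the seven values listed here.

```python
# Binding affinities (kcal/mol) quoted in the abstract above.
# More negative means stronger predicted binding.
scores = [
    ("MP7", "PDPK1", -10.918),
    ("MP7", "ERK1", -10.224),
    ("MP7", "ERK2", -10.134),
    ("MP7", "mTOR", -9.2),
    ("Zavegepant", "PDPK1", -13.399),
    ("Adozelesin", "mTOR", -11.74),
    ("Modoflaner", "PDPK1", -11.29),
]

def best_per_target(rows):
    """Return {target: (ligand, energy)} keeping the lowest energy per target."""
    best = {}
    for ligand, target, energy in rows:
        if target not in best or energy < best[target][1]:
            best[target] = (ligand, energy)
    return best

best = best_per_target(scores)
for target, (ligand, energy) in sorted(best.items()):
    print(f"{target}: {ligand} ({energy} kcal/mol)")
```

Within this small pooled list, the repurposed drugs lead for PDPK1 and mTOR, while MP7 is the only compound with quoted scores for ERK1 and ERK2, which is why a multi-target candidate is judged across all nodes rather than by a single best score.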
Computational Biology and Chemistry, vol. 113, Article 108255.
Citations: 0
Key genes and pathways in the molecular landscape of pancreatic ductal adenocarcinoma: A bioinformatics and machine learning study
IF 2.6 · Zone 4 (Biology) · Q2 BIOLOGY · Pub Date: 2024-10-24 · DOI: 10.1016/j.compbiolchem.2024.108268
Sinan Eyuboglu, Semih Alpsoy, Vladimir N. Uversky, Orkid Coskuner-Weber
Pancreatic ductal adenocarcinoma (PDAC) is recognized for its aggressive nature, dismal prognosis, and notably low five-year survival rate, underscoring the critical need for early detection methods and more effective therapeutic approaches. This research investigates the molecular mechanisms underlying PDAC, with a focus on identifying pivotal genes and pathways that may hold therapeutic relevance and prognostic value. Through the construction of a protein-protein interaction (PPI) network and the examination of differentially expressed genes (DEGs), the study uncovers key hub genes such as CDK1, KIF11, and BUB1, demonstrating their substantial role in the pathogenesis of PDAC. Notably, the dysregulation of these genes is consistent across a spectrum of cancers, making them potential targets for wide-ranging cancer therapeutics. This study also highlights significant genes encoding intrinsically disordered proteins, in particular GPRC5A and KRT7, revealing promising new pathways for therapeutic intervention. Machine learning techniques were used to classify PDAC patients with high accuracy, using the key genetic markers as features. The Support Vector Machine (SVM) model used the hub genes to achieve a sensitivity of 91 % and a specificity of 85 %, while the RandomForest model achieved a sensitivity of 91 % and a specificity of 92.5 %. Crucially, when the identified genes were cross-referenced with TCGA-PAAD clinical datasets, a tangible correlation with patient survival rates was found, reinforcing the potential of these genes as prognostic biomarkers and their viability as targets for therapeutic intervention. These findings attest to the value of molecular analysis in enhancing the understanding of PDAC and in advancing the pursuit of more effective diagnostic and treatment strategies.
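The sensitivity and specificity figures above follow from a classifier's confusion matrix. Here is a minimal sketch of that arithmetic; the counts are hypothetical, chosen only so the rates match the 91 % and 85 % reported for the SVM model, since the abstract does not give the actual sample sizes.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true patients correctly flagged.
    specificity = TN / (TN + FP): fraction of controls correctly cleared.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Hypothetical counts (NOT from the paper): 100 patients and 100 controls.
sens, spec = sensitivity_specificity(tp=91, fn=9, tn=85, fp=15)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints: sensitivity=0.91, specificity=0.85
```

Both models reach the same sensitivity, so the higher specificity reported for RandomForest means fewer false positives at an equal rate of detected cases.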
Computational Biology and Chemistry, vol. 113, Article 108268.
Citations: 0
Application of immunoinformatics to develop a novel and effective multiepitope chimeric vaccine against Variovorax durovernensis
IF 2.6 | Region 4 (Biology) | Q2 BIOLOGY | Pub Date: 2024-10-24 | DOI: 10.1016/j.compbiolchem.2024.108266
Ahmad Hasan, Muhammad Ibrahim, Wadi B. Alonazi, Jian Shen
Bloodstream infections caused by resistant bacteria such as Variovorax durovernensis, a recently reported Gram-negative bacterium, pose a significant public health challenge and add to the burden on healthcare systems. The design of a vaccine using chimeric peptides derived from a representative V. durovernensis strain holds significant promise for preventing disease onset. The current study employed reverse vaccinology (RV) approaches: retrieval of V. durovernensis proteomic data, removal of redundant proteins with CD-HIT, filtering for proteins that are essential and non-homologous to human proteins, and identification of outer membrane (OM) proteins with CELLO and PSORTb. Following these steps, immunoinformatic approaches were applied, such as epitope prediction with IEDB, vaccine design using linkers and an adjuvant, and analysis of antigenicity, allergenicity, safety, and stability. Among the 4208 nonredundant proteins, an OmpA family protein (A0A940EKP4) was designated a potential candidate for the development of a multiepitope vaccine construct. Upon analysis of the OM protein, six immunodominant B-cell epitopes were identified for the chimeric construct, following the prediction of cytotoxic T lymphocyte (CTL) and helper T lymphocyte (HTL) epitopes. Global population coverage was 58.18 % for the CTL epitopes, 46.56 % for the HTL epitopes, and 77.23 % overall. Using EAAAK, GPGPG, and AAY linkers, the cholera toxin B subunit adjuvant and appropriate epitopes were incorporated into a chimeric vaccine that effectively triggered both adaptive and innate immune responses. For example, the administered antigen count peaked on the fifth day post injection and then gradually declined until the fifteenth day. Elevated levels of several antibodies (IgG + IgM > 700,000; IgM > 600,000; IgG1 + IgG2; IgG1 > 500,000) were observed as the antigen concentration decreased. Molecular dynamics simulations carried out via iMODS revealed strong correlations between residue pairs, highlighting the stability of the docked complex. The designed vaccine has promising potential for eliciting specific immunogenic responses, thereby facilitating future research on vaccine development against V. durovernensis.
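The linker-based assembly mentioned in the abstract can be illustrated with a minimal sketch. The abstract names the EAAAK, GPGPG, and AAY linkers but does not say which linker joins which components, so the layout below (EAAAK after the adjuvant, AAY between CTL epitopes, GPGPG before and between HTL epitopes) is an assumption borrowed from common practice in multiepitope designs. All sequences are hypothetical placeholders, not the study's adjuvant or epitopes.

```python
# Minimal sketch of joining an adjuvant and epitopes with peptide linkers.
# The linker placement is an assumed layout, not the construct from the paper.

ADJUVANT_LINKER = "EAAAK"  # assumed: rigid linker between adjuvant and epitopes
CTL_LINKER = "AAY"         # assumed: joins CTL epitopes to each other
HTL_LINKER = "GPGPG"       # assumed: joins HTL epitopes and separates the blocks


def assemble_construct(adjuvant, ctl_epitopes, htl_epitopes):
    """Concatenate adjuvant, CTL block, and HTL block into one sequence."""
    ctl_block = CTL_LINKER.join(ctl_epitopes)
    htl_block = HTL_LINKER.join(htl_epitopes)
    return adjuvant + ADJUVANT_LINKER + ctl_block + HTL_LINKER + htl_block


# Hypothetical placeholder sequences, chosen only to make the output readable.
construct = assemble_construct(
    adjuvant="MK",
    ctl_epitopes=["WWWW", "CCCC"],
    htl_epitopes=["DDDD", "HHHH"],
)
print(construct)
print(len(construct))
```

A real construct would substitute the actual adjuvant and predicted epitope sequences, and the resulting chain would then go through the antigenicity, allergenicity, and stability checks the abstract lists.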
Citations: 0
An integrative analysis to identify pancancer epigenetic biomarkers
IF 2.6 | Region 4 (Biology) | Q2 BIOLOGY | Pub Date: 2024-10-23 | DOI: 10.1016/j.compbiolchem.2024.108260
Panchami V.U., Manish T.I., Manesh K.K.
Integrating and analyzing pancancer data collected from different experiments is crucial for gaining insight into the common molecular-level mechanisms underlying the development and progression of cancers. Epigenetic study of pancancer data can provide promising results in biomarker discovery. Genes that are epigenetically dysregulated in different cancers are powerful biomarkers for drug-related studies. This paper identifies genes whose expression is altered by aberrant methylation patterns, using differential analysis of TCGA pancancer data from 12 different cancers. We identified a comprehensive set of 115 epigenetic biomarker genes, of which 106 have pancancer properties. Correlation analysis, gene set enrichment, protein–protein interaction analysis, pancancer characteristics analysis, and diagnostic modeling were performed on these biomarkers to illustrate the power of this signature, and the genes were found to be important in different molecular operations related to cancer. An accuracy of 97.56% was obtained on the TCGA pancancer gene expression dataset for binary classification of samples as tumor or normal. The source code and dataset of this work are available at https://github.com/panchamisuneeth/EpiPanCan.git.
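The idea of selecting genes whose expression changes alongside aberrant methylation can be sketched as a simple filter. This is a minimal illustration and not the paper's pipeline. The inverse-direction rule (hypermethylated and downregulated, or hypomethylated and upregulated), both cutoffs, and the gene names are assumptions made for the example.

```python
# Minimal sketch: flag genes whose methylation and expression shift in
# opposite directions. Cutoffs are illustrative; the abstract states none.

METH_CUTOFF = 0.2  # assumed minimum |change in mean beta value| (tumor - normal)
EXPR_CUTOFF = 1.0  # assumed minimum |log2 fold change| (tumor vs. normal)


def epigenetically_dysregulated(stats, meth_cutoff=METH_CUTOFF, expr_cutoff=EXPR_CUTOFF):
    """Return genes passing both cutoffs with inverse methylation/expression change.

    stats maps a gene name to a (delta_beta, log2_fold_change) tuple.
    """
    hits = {}
    for gene, (delta_beta, log2fc) in stats.items():
        if delta_beta >= meth_cutoff and log2fc <= -expr_cutoff:
            hits[gene] = "hypermethylated, downregulated"
        elif delta_beta <= -meth_cutoff and log2fc >= expr_cutoff:
            hits[gene] = "hypomethylated, upregulated"
    return hits


# Toy values with hypothetical gene names (not data from the paper).
toy_stats = {
    "GENE_A": (0.35, -2.1),   # passes: more methylation, less expression
    "GENE_B": (-0.28, 1.6),   # passes: less methylation, more expression
    "GENE_C": (0.30, 0.2),    # fails: expression change too small
    "GENE_D": (0.05, -3.0),   # fails: methylation change too small
}
result = epigenetically_dysregulated(toy_stats)
print(result)
```

In a pancancer setting, a filter like this would be run per cancer type, and genes flagged in several types would be the candidates with pancancer properties.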
Citations: 0
CopyMix: Mixture model based single-cell clustering and copy number profiling using variational inference
IF 2.6 | Region 4 (Biology) | Q2 BIOLOGY | Pub Date: 2024-10-23 | DOI: 10.1016/j.compbiolchem.2024.108257
Negar Safinianaini, Camila P.E. De Souza, Andrew Roth, Hazal Koptagel, Hosein Toosi, Jens Lagergren
Investigating tumor heterogeneity using single-cell sequencing technologies is imperative for understanding how tumors evolve, since each cell subpopulation harbors a unique set of genomic features that yields a unique phenotype, which is bound to have clinical relevance. Clustering cells based on copy number data obtained from single-cell DNA sequencing provides an opportunity to identify different tumor cell subpopulations. Accordingly, computational methods have emerged for single-cell copy number profiling and clustering; however, these two tasks have been handled sequentially by applying various ad hoc pre- and post-processing steps, a procedure that is vulnerable to introducing clustering artifacts. Our method, CopyMix, a variational inference approach for a novel mixture model, avoids these clustering artifacts by jointly inferring cell clusters and their underlying copy number profiles. Our probabilistic graphical model is an improved version of the mixture of hidden Markov models, designed specifically for single-cell copy number profiling and clustering. For the evaluation, we used the likelihood-ratio test, CH index, Silhouette, V-measure, and total variation scores. CopyMix performs well on both biological and simulated data. Our favorable results indicate considerable potential for clinical impact from using CopyMix in studies of tumor heterogeneity.
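Of the evaluation scores listed in the abstract, V-measure is easy to state concretely: it is the harmonic mean of homogeneity (each cluster contains cells of one true class) and completeness (each true class lands in one cluster). The sketch below is a generic pure-Python implementation of that standard definition, not code from CopyMix; the label vectors are toy values.

```python
# Minimal sketch of V-measure for comparing a clustering with known labels.
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(labels, given):
    """Conditional entropy H(labels | given) in nats."""
    n = len(labels)
    joint = Counter(zip(given, labels))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(truth, predicted):
    """Harmonic mean of homogeneity and completeness (beta = 1)."""
    h_truth, h_pred = entropy(truth), entropy(predicted)
    homogeneity = 1.0 if h_truth == 0 else 1.0 - conditional_entropy(truth, predicted) / h_truth
    completeness = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(predicted, truth) / h_pred
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Toy example: cluster IDs are swapped relative to the true labels, which
# should not matter because V-measure ignores label names.
true_labels = [0, 0, 1, 1]
cluster_ids = [1, 1, 0, 0]
score = v_measure(true_labels, cluster_ids)
print(score)
```

Because the score is invariant to how clusters are numbered, it suits simulated data where the true cell-to-clone assignment is known but cluster IDs are arbitrary.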
Citations: 0
Computer-aided diagnosis of liver cancer with improved SegNet and deep stacking ensemble model
IF 2.6 | Region 4 (Biology) | Q2 BIOLOGY | Pub Date: 2024-10-19 | DOI: 10.1016/j.compbiolchem.2024.108243
Vinnakota Sai Durga Tejaswi, Venubabu Rachapudi
Liver cancer is a leading cause of cancer-related deaths and is often diagnosed at advanced stages due to reliance on traditional imaging methods. Existing computer-aided diagnosis systems struggle with noise, anatomical complexity, and ineffective feature integration, leading to inaccuracies in lesion segmentation and classification. By addressing these challenges, the proposed model aims to enhance early detection and assist clinicians in making informed decisions. Ultimately, this research seeks to contribute to more efficient and accurate liver cancer diagnosis. This paper presents a novel model for liver cancer classification, called SegNet-based Liver Cancer Classification via SqueezeNet (SgN-LCC-SqN). The model performs liver cancer segmentation and classification through four key steps: preprocessing, segmentation, feature extraction, and classification. During preprocessing, Quadratic Mean Estimated Wiener Filtering (QMEWF) is used to minimize image noise. Segmentation divides the image into segments using Enhanced Feature Pyramid SegNet (EFP-SgN), which is essential for precise diagnosis. Feature extraction encompasses color features, Local Directional Pattern Variance, and Correlation Filtering-Local Gradient Increasing Pattern (CF-LGIP) features. The extracted features are then processed by an ensemble model, Deep Convolutional, Recurrent, Long Short Term Memory with SqueezeNet (DCR-LSTM-SqN), which comprises Deep Convolutional Neural Network (DCNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Modified Loss Function in SqueezeNet (MLF-SqN) classifiers; the feature sets are analyzed sequentially by the DCNN, RNN, and LSTM before final classification by MLF-SqN. The performance of the proposed DCR-LSTM-SqN model is evaluated against conventional methods on positive, negative, and other metrics. The DCR-LSTM-SqN model consistently demonstrates superior accuracy, ranging from 0.947 to 0.984, across all training data percentages. Thus, the proposed model effectively segments liver lesions and classifies cancerous areas, demonstrating its potential as a valuable resource for clinicians to enhance the efficiency and accuracy of liver cancer diagnosis.
Citations: 0