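Elucidating B4GALNT1 as potential biomarker in hepatocellular carcinoma using machine learning models and mutational dynamics explored through MD simulation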

Rohit Kumar Verma, Kiran Bharat Lokhande, Prashant Kumar Srivastava, Ashutosh Singh

DOI: 10.1016/j.imu.2024.101514
Journal: Informatics in Medicine Unlocked, volume 48, Article 101514
Publication date: 2024-01-01
Publication type: Journal Article
Full text: https://www.sciencedirect.com/science/article/pii/S2352914824000704
Citations: 0

Abstract


Liver hepatocellular carcinoma (LIHC) is one of the leading contributors to cancer-related mortality worldwide. Identifying new biomarkers is critical because patients with LIHC are frequently diagnosed at advanced stages, which increases mortality. This study used TCGA-LIHC gene expression datasets to identify biomarkers while addressing the complexity of the datasets. A combination of feature selection (FS) techniques was applied, and the performance of this strategy was assessed with ten machine learning classifiers. The results were integrated to retain biomarkers identified by at least five FS techniques, which yielded 55 potential biomarkers for LIHC. The Gaussian Naive Bayes classifier (AUC = 0.99) was the most effective, achieving 98.67% accuracy on the test dataset with the 55 identified biomarkers. We also performed differential gene expression, survival, and enrichment analyses for all identified biomarkers. Lasso-penalized Cox regression then narrowed the set to thirteen genes. Of these thirteen, we singled out B4GALNT1 because of its statistical significance in differential expression analysis and its growing importance across various cancer types, including LIHC. We carried out comprehensive bioinformatics and molecular dynamics simulation studies, along with other structural analyses, of B4GALNT1 in LIHC. Six mutations in LIHC (P64Q, S131F, A311S, R340Q, D478H, and P507Q) were predicted to be probably damaging by in-silico prediction algorithms. Compared with the wild type, the B4GALNT1 variants P64Q and S131F show increased stability, accompanied by decreased atomic fluctuations that indicate a more rigid protein structure. In contrast, mutations such as A311S and P507Q increase flexibility, highlighting their structural impact on B4GALNT1. The study demonstrates that combining various feature selection methods effectively reveals new biomarkers, with direct bearing on their biological significance. Furthermore, our findings link increased B4GALNT1 expression in patients with liver cancer to a poorer prognosis, highlighting its potential as a promising therapeutic target.
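
The consensus step described in the abstract (keeping only biomarkers identified by at least five FS techniques) can be illustrated with a minimal sketch. It is not the authors' code: the function name `consensus_features`, the method labels `method_1` to `method_6`, and every gene name other than B4GALNT1 are hypothetical placeholders, and the toy lists are not data from the study.

```python
from collections import Counter


def consensus_features(selections, min_votes=5):
    """Return features picked by at least `min_votes` selection methods.

    `selections` maps a method label to the list of features it selected.
    The function also returns the per-feature vote counts.
    """
    votes = Counter()
    for features in selections.values():
        # set() ensures a method contributes at most one vote per feature
        votes.update(set(features))
    kept = [f for f, n in votes.items() if n >= min_votes]
    # Order by descending vote count, then alphabetically for stable output
    kept.sort(key=lambda f: (-votes[f], f))
    return kept, votes


# Hypothetical toy input: six placeholder FS methods and placeholder genes
selections = {
    "method_1": ["B4GALNT1", "GENE_A", "GENE_B"],
    "method_2": ["B4GALNT1", "GENE_A", "GENE_B"],
    "method_3": ["B4GALNT1", "GENE_A", "GENE_B", "GENE_C"],
    "method_4": ["B4GALNT1", "GENE_A", "GENE_B"],
    "method_5": ["B4GALNT1", "GENE_A", "GENE_C"],
    "method_6": ["B4GALNT1"],
}

consensus, votes = consensus_features(selections, min_votes=5)
print(consensus)  # ['B4GALNT1', 'GENE_A']
```

In this toy input GENE_B receives four votes and is dropped, which shows how the threshold trades sensitivity for agreement across methods. The abstract reports the threshold (five) and the resulting count (55) but does not say how ties or overlapping gene lists were handled, so those details are left open here.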

Source journal
Informatics in Medicine Unlocked (Medicine-Health Informatics)
CiteScore: 9.50
Self-citation rate: 0.00%
Articles published: 282
Review time: 39 days
About the journal: Informatics in Medicine Unlocked (IMU) is an international gold open access journal covering a broad spectrum of topics within medical informatics, including (but not limited to) papers focusing on imaging, pathology, teledermatology, public health, ophthalmological, nursing and translational medicine informatics. The full papers that are published in the journal are accessible to all who visit the website.