Machine learning approach to predict blood-secretory proteins and potential biomarkers for liver cancer using omics data

ACS Applied Bio Materials (IF 4.6, Q2, Materials Science, Biomaterials) | Pub Date: 2024-08-30 | DOI: 10.1016/j.jprot.2024.105298
Dahrii Paul, Vigneshwar Suriya Prakash Sinnarasan, Rajesh Das, Md Mujibur Rahman Sheikh, Amouda Venkatesan
Citations: 0

Abstract

Identifying non-invasive blood-based biomarkers is crucial for the early detection and monitoring of liver cancer (LC) and, in turn, for improving patient outcomes. This study used computational approaches to predict potential blood-based biomarkers for LC. Machine learning (ML) models were developed using selected features of blood-secretory proteins collected from curated databases. The logistic regression (LR) model showed the best performance. Transcriptome analysis across 7 LC cohorts revealed 231 common differentially expressed genes (DEGs). The proteins encoded by these DEGs were compared with the ML dataset, revealing 29 proteins that overlap with the blood-secretory dataset. Among the remaining protein-coding genes, the LR model predicted 29 additional proteins as blood-secretory. As a result, 58 potential blood-secretory proteins were obtained. Among the top 20 genes, 13 common hub genes were identified. Further, area under the receiver operating characteristic curve (ROC AUC) analysis was performed to assess the genes as potential diagnostic blood biomarkers. Six genes, ESM1, FCN2, MDK, GPC3, CTHRC1 and COL6A6, exhibited an AUC value higher than 0.85 and were predicted as blood-secretory. This study highlights the potential of an integrative computational approach for discovering non-invasive blood-based biomarkers in LC, facilitating further validation and clinical translation.
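
The ROC AUC screening step mentioned above can be illustrated with a minimal sketch. It assumes a single gene's expression values and binary tumor/normal labels; the function name `roc_auc`, the variable names, and the numbers are hypothetical and are not taken from the study. The AUC is computed here through its rank interpretation (the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample), which may differ from the tooling the authors actually used.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the ROC AUC of `scores` for binary `labels` (1 = tumor, 0 = normal).

    Uses the pairwise (Mann-Whitney) formulation: the fraction of
    tumor/normal pairs in which the tumor sample has the higher score,
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one gene (illustrative only).
expression = [8.1, 7.4, 9.0, 6.9, 5.2, 5.8, 7.0, 4.9]
is_tumor = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(expression, is_tumor)
# The abstract reports genes with an AUC higher than 0.85.
passes_cutoff = auc > 0.85
```

With this formulation, a gene that is lower in tumors than in normal tissue yields an AUC below 0.5; in that case one would typically score the negated expression instead. Whether the study handled down-regulated genes this way is not stated in the abstract.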

Significance

Liver cancer is one of the leading causes of premature death worldwide, with its prevalence and mortality rates projected to increase. Although current diagnostic methods are highly sensitive, they are invasive and unsuitable for repeated testing. Blood biomarkers offer a promising non-invasive alternative, but the wide dynamic range of blood protein concentrations poses experimental challenges. Therefore, using available omics data to develop a diagnostic model could provide a potential solution for accurate diagnosis. This study developed a computational method integrating machine learning and bioinformatics analysis to identify potential blood biomarkers. As a result, ESM1, FCN2, MDK, GPC3, CTHRC1 and COL6A6 were identified as potential biomarkers, holding significant promise for improving the diagnosis and understanding of liver cancer. The integrated method can be applied to other cancers, offering a possible solution for early detection and improved patient outcomes.
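
How the machine learning and bioinformatics parts could be combined is sketched below, under stated assumptions. The sketch takes per-cohort DEG lists, keeps the genes common to every cohort, and then merges the DEGs already present in a known blood-secretory set with those a classifier predicts as secretory. All names (`common_degs`, `candidate_secretory`, `toy_model`) and the gene labels `G1` to `G6` are hypothetical placeholders; the classifier is a stand-in callable, not the study's logistic regression model.

```python
from functools import reduce
from typing import Callable, Iterable, Set, Tuple


def common_degs(cohort_degs: Iterable[Iterable[str]]) -> Set[str]:
    """Return the genes called differentially expressed in every cohort."""
    return reduce(set.intersection, (set(d) for d in cohort_degs))


def candidate_secretory(
    degs: Set[str],
    known_secretory: Set[str],
    predict_secretory: Callable[[str], bool],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split DEGs into known and model-predicted secretory candidates.

    Returns (known, predicted, known | predicted). Only DEGs that are not
    already in the known set are passed to the classifier.
    """
    known = degs & known_secretory
    predicted = {g for g in degs - known if predict_secretory(g)}
    return known, predicted, known | predicted


# Placeholder DEG lists for three cohorts (the study used seven).
cohorts = [
    {"G1", "G2", "G3", "G4"},
    {"G1", "G2", "G3", "G5"},
    {"G1", "G2", "G3", "G4", "G6"},
]
known_set = {"G1", "G6"}


def toy_model(gene: str) -> bool:
    """Stand-in for a trained classifier; flags only the placeholder G2."""
    return gene == "G2"


degs = common_degs(cohorts)
known, predicted, candidates = candidate_secretory(degs, known_set, toy_model)
```

The same bookkeeping matches the counts reported in the abstract: 29 DEG-encoded proteins overlapping the blood-secretory dataset plus 29 predicted by the model give 58 candidates.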

Source journal
ACS Applied Bio Materials
Chemistry: Chemistry (all)
CiteScore: 9.40
Self-citation rate: 2.10%
Articles published: 464