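Leveraging SEER data through machine learning to predict distant lymph node metastasis and prognosticate outcomes in hepatocellular carcinoma patients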

IF 4.3 | Zone 3, Materials Science | Q1, ENGINEERING, ELECTRICAL & ELECTRONIC | ACS Applied Electronic Materials | Pub Date: 2024-08-26 | DOI: 10.1002/jgm.3732
Jiaxuan Sun, Lei Huang, Yahui Liu
Citations: 0

Abstract

Leveraging SEER data through machine learning to predict distant lymph node metastasis and prognosticate outcomes in hepatocellular carcinoma patients

Objectives

This study aims to develop and validate machine learning–based diagnostic and prognostic models to predict the risk of distant lymph node metastases (DLNM) in patients with hepatocellular carcinoma (HCC) and to evaluate the prognosis for this cohort.

Design

This retrospective study analyzes data extracted from the Surveillance, Epidemiology, and End Results (SEER) database, specifically the January 2024 subset.

Participants

The study cohort consists of 15,775 patients diagnosed with HCC as identified within the SEER database, spanning 2016 to 2020.

Method

For the diagnostic model, recursive feature elimination (RFE) is used for variable selection and retains five predictors: age, tumor size, radiation therapy, T-stage, and serum alpha-fetoprotein (AFP) level. These variables feed a stacking ensemble model, which is interpreted with Shapley Additive Explanations (SHAP). For the prognostic model, backward stepwise regression selects the variables chemotherapy, radiation therapy, tumor size, and age. The resulting Cox proportional hazards model is presented as a prognostic nomogram.
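To make the SHAP step more concrete, the following is a minimal sketch of exact Shapley attribution on a toy risk score. Everything in it is an assumption made for illustration: the function `toy_score`, its coefficients, the three features, and the single reference patient used as baseline are hypothetical and are not taken from the study. SHAP libraries typically average over a background dataset instead of one baseline row.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Each feature's value is its marginal contribution to the prediction,
    averaged over every order in which features can be switched from the
    baseline value to the patient's value.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)        # start from the reference patient
        previous = predict(current)
        for i in order:
            current[i] = x[i]           # reveal feature i
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [total / len(orderings) for total in phi]


def toy_score(features):
    """Hypothetical risk score with one interaction term (not the study's model)."""
    age, tumor_size, afp_positive = features
    return (0.02 * age + 0.1 * tumor_size + 0.5 * afp_positive
            + 0.05 * tumor_size * afp_positive)


patient = [70.0, 6.0, 1.0]     # hypothetical age, tumor size (cm), AFP flag
reference = [60.0, 3.0, 0.0]   # hypothetical baseline patient

phi = shapley_values(toy_score, patient, reference)
for name, value in zip(["age", "tumor_size", "afp_positive"], phi):
    print(f"{name}: {value:+.3f}")
```

The attributions always sum to the difference between the patient's score and the baseline score, which is the property that lets a SHAP plot decompose a single prediction into per-feature contributions.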

Main outcome measures

The outcome of the diagnostic model is the occurrence of DLNM in patients. The outcome of the prognosis model is determined by survival time and survival status.
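For reference, the standard form of the Cox proportional hazards model shows how survival time and survival status enter the prognostic model. The notation is an assumption introduced here, not the paper's: $t_i$ is the observed time for patient $i$, $\delta_i$ is the status indicator (1 for an event, 0 for censoring), $x_i$ is the covariate vector, and tied event times are ignored for simplicity.

```latex
% Hazard for a patient with covariates x
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Partial likelihood over patients with an observed event (\delta_i = 1);
% the denominator sums over patients still at risk at time t_i
L(\beta) = \prod_{i:\,\delta_i = 1}
  \frac{\exp\!\left(\beta^{\top} x_i\right)}
       {\sum_{j:\,t_j \ge t_i} \exp\!\left(\beta^{\top} x_j\right)}
```

A nomogram built on this model rescales each fitted coefficient in $\beta$ into points, so that a patient's total points correspond to the linear predictor $\beta^{\top} x$.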

Results

The stacking-based ensemble model shows good predictive performance, interpretability, and discrimination. The area under the curve (AUC) is 0.767 in the training set and 0.768 in the validation set. The nomogram built on the Cox model also shows consistent, strong predictive ability. We also identified factors with a substantial impact on DLNM and prognosis and discussed their significance in the model and in clinical practice.
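As a reminder of what the reported AUC measures, here is a small sketch that computes it from its definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels and scores are made-up illustration values, not study data, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = DLNM present, 0 = absent
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Under this reading, an AUC of about 0.77 means the model ranks a patient with DLNM above a patient without it in roughly 77% of such pairs. Near-identical training and validation values suggest little overfitting.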

Conclusion

Our study identified key predictive factors for DLNM and elucidated significant prognostic indicators for HCC patients with DLNM. These findings provide clinicians with valuable tools to accurately identify high-risk individuals for DLNM and conduct more precise risk stratification for this patient subgroup, potentially improving management strategies and patient outcomes.
