Development of an AI model for predicting hypoxia status and prognosis in non-small cell lung cancer using multi-modal data.

IF 4.0; Zone 2 (Medicine); Q2 Oncology; Translational Lung Cancer Research; Pub Date: 2024-12-31; Epub Date: 2024-12-27; DOI: 10.21037/tlcr-24-982
Lina Zhou, Chenkai Mao, Tingting Fu, Xiao Ding, Luca Bertolaccini, Ao Liu, Junjun Zhang, Shicheng Li
Translational Lung Cancer Research, vol. 13, no. 12, pp. 3642-3656. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736583/pdf/
Citations: 0

Abstract

Background: Prognosis prediction is crucial for non-small cell lung cancer (NSCLC) treatment planning. While tumor hypoxia significantly impacts patient outcomes, identifying hypoxic genomic markers remains challenging. This study sought to identify hypoxic computed tomography (CT) radiomic features and create an artificial intelligence (AI) model for NSCLC through the integration of multi-modal data.

Methods: In total, 452 NSCLC patients were enrolled in this study, including patients from The Second Affiliated Hospital of Soochow University (SC, n=112), The Cancer Genome Atlas (TCGA)-NSCLC dataset (n=74), the radiogenomics dataset (n=130), and the Gene Expression Omnibus (GEO) datasets (GSE19188: n=82, and GSE87340: n=54). Hypoxia status was classified using optimized cut-off values of hypoxia enrichment scores, which were calculated through single-sample gene set enrichment analysis (ssGSEA) of hypoxic genes. Radiomic features were extracted using three-dimensional (3D)-Slicer software. The least absolute shrinkage and selection operator (LASSO) algorithm was used to identify hypoxic CT radiomic features. A model named ssuBERT (semantic structured unit embedded in Bidirectional Encoder Representations from Transformers) was developed to analyze electronic health records (EHRs). An AI model for overall survival prediction was constructed by integrating CT radiomic features, ssuBERT features, and clinical data, and evaluated using five-fold cross-validation.
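The hypoxia labelling and five-fold evaluation described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `label_hypoxia` and `k_fold_indices`, the cut-off value, and the scores are all hypothetical, and the abstract does not state how the cut-off was optimized or how the folds were drawn.

```python
import random

def label_hypoxia(scores, cutoff):
    """Return 1 (hypoxic) when an enrichment score exceeds the cut-off, else 0."""
    return [1 if s > cutoff else 0 for s in scores]

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k near-equal folds.

    Yields (train_indices, test_indices) pairs, one per fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical ssGSEA-style enrichment scores (not taken from the study).
scores = [0.12, 0.45, 0.31, 0.78, 0.05, 0.66, 0.52, 0.29, 0.91, 0.40]
labels = label_hypoxia(scores, cutoff=0.42)  # 0.42 is an arbitrary placeholder
splits = list(k_fold_indices(len(scores), k=5))
```

Each patient lands in exactly one test fold, so every sample contributes to the held-out evaluation once. In practice a stratified split (keeping the class ratio similar in every fold) is often preferred for imbalanced outcomes; whether the study did so is not stated in the abstract.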

Results: Higher hypoxia levels were correlated with worse survival outcomes. Twenty-eight radiomic features showed significant discriminatory power in detecting hypoxia status with an area under the curve (AUC) of 0.8295. The ssuBERT model achieved a weighted accuracy of 0.945 in recognizing semantic structured units in EHRs. The EHR model exhibited superior predictive performance among the single-modal models with an AUC of 0.7662. However, the multi-modal AI model had the highest average AUC of 0.8449 and an F1 score of 0.7557.
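For readers less familiar with the two metrics reported above, the sketch below shows how an AUC and an F1 score can be computed for one cross-validation fold. The labels, predicted scores, and the 0.5 decision threshold are invented for illustration, and the helper names `roc_auc` and `f1_score` are hypothetical; the abstract does not describe the authors' implementation or threshold.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy fold: invented outcome labels and model scores.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # 0.5 threshold is an assumption

fold_auc = roc_auc(y_true, y_score)
fold_f1 = f1_score(y_true, y_pred)
```

An "average AUC" such as the one reported would then be the mean of the per-fold values. Note that AUC is threshold-free, while F1 depends on the chosen decision threshold, which is why the two numbers can diverge.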

Conclusions: The AI model demonstrated potential in predicting NSCLC patient prognosis through multi-modal data integration, warranting further validation.

Source journal
CiteScore: 7.20
Self-citation rate: 2.50%
Articles published: 137
Journal description: Translational Lung Cancer Research (TLCR, Transl Lung Cancer Res; Print ISSN 2218-6751; Online ISSN 2226-4477) is an international, peer-reviewed, open-access journal founded in March 2012. TLCR is indexed by PubMed/PubMed Central and the Chemical Abstracts Service (CAS) Databases. It was published quarterly in its first year and has been published bimonthly since February 2013. It provides practical, up-to-date information on the prevention, early detection, diagnosis, and treatment of lung cancer. Specific areas of interest include, but are not limited to, multimodality therapy, markers, imaging, tumor biology, pathology, chemoprevention, and technical advances related to lung cancer.
Latest articles from this journal
Identification and validation of pyroptosis patterns with a novel quantification system for the prediction of prognosis in lung squamous cell carcinoma.
Impact of lymph node involvement in pulmonary carcinoids: a narrative review.
Inhibition of miR-9-3p facilitates ferroptosis by activating SAT1/p53 pathway in lung adenocarcinoma.
Long-term high fat diet aggravates the risk of lung fibrosis and lung cancer: transcriptomic analysis in the lung tissues of obese mice.
Long-term survival after combination therapy with atezolizumab in a patient with small-cell lung cancer: a case report.