Evaluating generative AI models for explainable pathological feature extraction in lung adenocarcinoma: grading assessment and prognostic model construction

IF 7.6; Zone 1, Medicine; JCR Q1, Health Care Sciences & Services; The Lancet Regional Health: Western Pacific; Pub Date: 2025-02-01; DOI: 10.1016/j.lanwpc.2024.101352
Junyi Shen, Anqi Lin, Ting Wei, Jian Zhang, Peng Luo
The Lancet Regional Health: Western Pacific, volume 55, Article 101352. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2666606524003468

Abstract

Background

With the widespread application of generative AI (GenAI) models, it is crucial to systematically evaluate their performance in lung adenocarcinoma histopathological assessment. This study aimed to evaluate and compare the performance of three GenAI models with visual capabilities (GPT-4o, Claude-3.5-Sonnet, and Gemini-1.5-Pro) in lung adenocarcinoma histological pattern recognition and grading, and to explore the construction of prognostic prediction models based on GenAI feature extraction.

Methods

This retrospective study extracted 310 diagnostic slides from the TCGA-LUAD database for model evaluation. An additional 87 diagnostic pathology slides from local lung adenocarcinoma surgical patients were used for external validation of the prognostic model. Primary outcomes were GenAI grading accuracy and stability, measured by the area under the receiver operating characteristic curve (AUC) and the intraclass correlation coefficient (ICC), respectively. Secondary outcomes included the construction and assessment of machine learning-based prognostic prediction models that used features extracted by GenAI, with model performance evaluated using the concordance index (C-index).
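To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is an illustration only, not the authors' implementation: the function name `harrell_c_index` and the toy follow-up times, event flags, and risk scores are hypothetical and do not come from the study.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       observed follow-up time per patient
    events:      1 if the event was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # follow-up ended in an observed event (not censoring).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not study data).
toy_times = [5, 8, 12, 20, 30]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"toy C-index = {toy_c:.3f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which a reported C-index should be read.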

Findings

Claude-3.5-Sonnet demonstrated the best overall performance, combining high grading accuracy (average AUC = 0.82) with moderate stability (ICC = 0.59). The optimal machine learning-based prognostic model, constructed using features extracted by Claude-3.5-Sonnet and incorporating clinical variables, showed good performance in both internal and external validation, with an average C-index of 0.72. Meta-analysis demonstrated that this prognostic model effectively stratified patients into risk groups, with the high-risk group showing significantly worse outcomes (hazard ratio = 6.44, 95% confidence interval = 3.42-12.14).
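As a reading aid for the reported hazard ratio, the sketch below shows how a hazard ratio and its 95% confidence interval relate on the log scale. It assumes a Wald-type interval that is symmetric around log(HR), which the abstract does not state; the helper names `log_scale_se` and `wald_ci` are hypothetical. Only the three reported numbers (6.44, 3.42, 12.14) come from the text.

```python
import math


def log_scale_se(ci_lower, ci_upper, z=1.96):
    """Standard error of log(HR) implied by a symmetric log-scale CI."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)


def wald_ci(hr, se, z=1.96):
    """Wald-type confidence interval for a hazard ratio, built on the log scale."""
    log_hr = math.log(hr)
    return math.exp(log_hr - z * se), math.exp(log_hr + z * se)


# Reported values from the abstract.
hr, ci_lower, ci_upper = 6.44, 3.42, 12.14

se = log_scale_se(ci_lower, ci_upper)
lower, upper = wald_ci(hr, se)

# Under the symmetry assumption, the point estimate should sit near the
# geometric mean of the two interval bounds.
geo_mean = math.sqrt(ci_lower * ci_upper)

print(f"implied SE of log(HR): {se:.3f}")
print(f"geometric mean of CI bounds: {geo_mean:.2f}")
print(f"reconstructed 95% CI: {lower:.2f} to {upper:.2f}")
```

Because the lower bound is well above 1, the interval excludes the no-effect value, which is consistent with the abstract's description of significantly worse outcomes in the high-risk group.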

Interpretation

This study demonstrates the potential application value of GenAI models in lung adenocarcinoma histopathological assessment. Claude-3.5-Sonnet demonstrated the highest grading accuracy, and the machine learning-based prognostic model that utilized its feature extraction showed good predictive capabilities. These findings provide new research directions for AI-assisted pathological diagnosis and prognostic prediction, with the potential to improve the management of lung adenocarcinoma patients.
Source journal
The Lancet Regional Health: Western Pacific
Subject area: Medicine-Pediatrics, Perinatology and Child Health
CiteScore: 8.80
Self-citation rate: 2.80%
Articles published: 305
Review time: 11 weeks
Journal description: The Lancet Regional Health – Western Pacific, a gold open access journal, is an integral part of The Lancet's global initiative advocating for healthcare quality and access worldwide. It aims to advance clinical practice and health policy in the Western Pacific region, contributing to enhanced health outcomes. The journal publishes high-quality original research shedding light on clinical practice and health policy in the region. It also includes reviews, commentaries, and opinion pieces covering diverse regional health topics, such as infectious diseases, non-communicable diseases, child and adolescent health, maternal and reproductive health, aging health, mental health, the health workforce and systems, and health policy.