
JCO Clinical Cancer Informatics: Latest Articles

Reimagining Cancer Care With Generative Artificial Intelligence: The Promise of Large Language Models.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 DOI: 10.1200/CCI-25-00134
Ji-Eun Irene Yum, Syed Arsalan Ahmed Naqvi, Ben Zhou, Irbaz Bin Riaz

The emergence of state-of-the-art large language models (LLMs), which hold the ability to generalize to diverse natural language processing tasks, has led to new opportunities in health care. Oncology is especially well-suited to leverage these resources as the journeys of patients with cancer inherently yield extensive, longitudinal data sets comprising clinical narratives, pathology and radiology reports, and genomic sequencing reports. This review begins with an overview of the fundamental concepts behind LLMs, including the definitions, architecture, training paradigm, and performance optimization through prompt engineering and retrieval-augmented generation. We also take a moment to explore the newly emerging paradigm of LLMs in a multiagentic framework. We then synthesize current research on how LLMs may benefit stakeholders within the practice of oncology, including patients, oncologists, researchers, and learners. Finally, we address the limitations and risks of LLMs, including hallucinations, inherent biases, patient privacy, and clinician deskilling. While research thus far shows significant potential for LLMs to transform cancer care, necessary future directions include studies emphasizing patient stakeholder perspectives on LLM incorporation in clinical workflows, the development of relevant clinical benchmarks for LLM evaluation, a greater focus on real-world prospective testing, and deeper exploration of LLM reasoning capabilities.

JCO Clinical Cancer Informatics, volume 9, e2500134. Publication type: Journal Article.
Citations: 0
Geospatial Analysis of Commission on Cancer-Accredited Centers Within Cancer Care Utilization-Based Catchment Areas.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 Epub Date: 2025-12-23 DOI: 10.1200/CCI-25-00163
Nicole Rademacher, Connor Sisk, Joshua S Richman, Kristy Broman, Changzhen Wang

Purpose: The Commission on Cancer (CoC) seeks to expand access to high-quality care through community engagement standards targeting centers' catchment areas and efforts to accredit centers in more areas including rural hospitals. Little is known about the social, environmental, and geographic characteristics of their catchment areas. To support future investigation into the impact of CoC-accredited centers, this study compares characteristics of cancer care utilization-based catchment areas, termed Cancer Service Areas (CSAs), with and without CoC-accredited centers.

Methods: Geocoded CoC-accredited centers and cancer care patient flows extracted from Medicare claims data were used to delineate CSAs using a spatially constrained community detection method. Characteristics including environmental justice index (EJI), social vulnerability index (SVI), rurality, travel time, and localization index (LI, a ratio of cancer care received by patients within a CSA) were aggregated by CSA. A logistic regression model was created to evaluate characteristics associated with the presence of a CoC-accredited center within a CSA.
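
The localization index described above can be sketched in a few lines. This is a minimal illustration, assuming LI is the share of a CSA's residents' cancer care that is delivered inside that same CSA; the abstract does not give the exact computation, and the flow tuples and the function name `localization_index` are hypothetical.

```python
from collections import defaultdict


def localization_index(flows):
    """Return {csa: share of residents' visits that stay inside the CSA}.

    flows: iterable of (patient_csa, provider_csa, visits) tuples.
    The tuple layout is an assumption made for this sketch.
    """
    local = defaultdict(float)   # visits delivered inside the patient's own CSA
    total = defaultdict(float)   # all visits by residents of the CSA
    for patient_csa, provider_csa, visits in flows:
        total[patient_csa] += visits
        if patient_csa == provider_csa:
            local[patient_csa] += visits
    return {csa: local[csa] / total[csa] for csa in total if total[csa] > 0}


# Hypothetical patient flows between two service areas.
flows = [
    ("A", "A", 80), ("A", "B", 20),   # residents of A mostly stay in A
    ("B", "B", 45), ("B", "A", 55),   # residents of B mostly travel to A
]
li = localization_index(flows)
```

Under this definition, area A retains 80 of 100 visits and area B retains 45 of 100, so A has the higher LI.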

Results: Six hundred sixty-eight CSAs were defined, of which 511 CSAs had at least one CoC-accredited center. CSAs with CoC-accredited centers had lower health vulnerability (odds ratio [OR], 0.65 [95% CI, 0.427 to 0.993]) and lower racial and ethnic minority status vulnerability (OR, 0.61 [95% CI, 0.424 to 0.886]), but no differences for other components of the EJI or SVI. These CSAs also had higher LIs, meaning patients remained in their local CSA for care (OR, 9.00 [95% CI, 2.408 to 33.640] for high v low LIs).
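
As a reading aid for the odds ratios above, the sketch below shows how a logistic regression coefficient maps to an OR with a 95% CI, and how the coefficient and standard error can be backed out of a reported interval. It assumes a Wald interval that is symmetric on the log scale, which the abstract does not state; the function names are illustrative.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald CI from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(lower, upper, z=1.96):
    """Recover the coefficient and standard error from a reported CI.

    Assumes the interval is symmetric on the log scale (Wald interval).
    """
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported interval for health vulnerability: 0.427 to 0.993.
beta, se = beta_se_from_ci(0.427, 0.993)
or_point, or_low, or_high = odds_ratio_ci(beta, se)
```

The recovered point estimate is the geometric mean of the interval limits, which agrees with the reported OR of 0.65 to two decimal places.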

Conclusion: Minority and comorbid populations may have more difficulty accessing cancer center care, further exacerbating observed variations in cancer outcomes. Cancer centers may address this by broadening their outreach into at-risk catchment areas.

JCO Clinical Cancer Informatics, volume 9, e2500163. Publication type: Journal Article.
Citations: 0
Development, External Validation, and Deployment of RFAN-ML: A Machine Learning Model to Estimate Renal Function After Nephrectomy.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-07 DOI: 10.1200/CCI-25-00086
Jesse Persily, Steven L Chang, Chen Chen, Yassamin Neshatvar, Siri Desiraju, Rajesh Ranganath, Katie Murray, Adam Feldman, Douglas Dahl, Samir S Taneja, William C Huang, Madhur Nayan

Purpose: Partial nephrectomy has been advocated as the preferred surgical approach for small kidney tumors over total nephrectomy. However, partial nephrectomy is associated with increased perioperative risk. Estimating renal function after nephrectomy can facilitate personalized patient counseling, guide surgical approach, and identify patients who could benefit from perioperative interventions. Existing prediction models have several limitations including the lack of external validation or a user-friendly tool or application, and most have used traditional statistical methods.

Methods: We used data from two academic medical institutions and machine learning (ML) methods to develop and externally validate Renal Function After Nephrectomy-Machine Learning (RFAN-ML), a model to estimate long-term renal function after partial or total nephrectomy. Boruta feature selection was used to select four routinely available clinical features, specifically age, BMI, preoperative renal function, and nephrectomy type. In the training set of 1,932 patients, we compared six ML regression models representing a set of both ensemble and nonensemble ML algorithms and optimized for root mean squared error (RMSE). The models were evaluated in a test set of 1,995 patients, and the best-performing model was selected as RFAN-ML.

Results: We compared RFAN-ML with existing renal function prediction benchmarks and found that RFAN-ML outperformed or had competitive performance with benchmarks on RMSE (16.6 [95% CI, 15.6 to 17.5]), R², and mean absolute error.
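
For readers less familiar with the three metrics named above, here is a minimal sketch of how RMSE, mean absolute error, and R² are computed for a regression model. The observed and predicted values are invented for illustration and are not data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical postoperative renal function values (illustrative only).
observed = [60.0, 45.0, 80.0, 55.0]
predicted = [58.0, 50.0, 76.0, 55.0]
rmse, mae, r2 = regression_metrics(observed, predicted)
```

RMSE penalizes large errors more heavily than mean absolute error, which is one reason to report both alongside R².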

Conclusion: We developed and externally validated RFAN-ML, an ML model to predict renal function after nephrectomy, and have deployed our model online. RFAN-ML has the potential to improve the care and outcomes in patients with kidney tumors by informing personalized patient counseling and guiding surgical planning.

JCO Clinical Cancer Informatics, volume 9, e2500086. Publication type: Journal Article.
Citations: 0
Harvesting Risk: An Ecologic Study of Agricultural Practices and Patterns and Melanoma Incidence in Pennsylvania.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-14 DOI: 10.1200/CCI-25-00160
Benjamin J Marks, Jiangang Liao, Charlene Lam, Camille Moeckel, Eugene J Lengerich

Purpose: To examine the geospatial distribution of melanoma incidence in Pennsylvania (PA), quantify its association with agriculture practices and patterns, and consider its relevance for cancer control.

Methods: The study used an ecologic design with county-level PA data on the 2017-2021 incidence of invasive melanoma among adults 50 years and older, as well as agricultural patterns and practices, ultraviolet radiation (UVR), and demographics/socioeconomics. Spatial clustering was examined using local indicators of spatial association and Getis-Ord Gi*. Separate adjacency-weighted Conway-Maxwell-Poisson models, adjusted for UVR and social vulnerability, quantified the association between melanoma and (1) cultivated and pasture/hay acreage and (2) herbicide-, insecticide-, fungicide-, and manure-treated acreage.
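
The Getis-Ord Gi* statistic mentioned above can be illustrated with a small standard-library sketch. It assumes binary contiguity weights with the focal county included in its own neighborhood; the abstract does not specify the weighting scheme, and the county values and adjacency list below are hypothetical.

```python
import math


def getis_ord_gi_star(values, neighbors):
    """Gi* z-scores with binary weights (focal unit included in its own set).

    values: list of rates, one per areal unit.
    neighbors: dict mapping unit index -> iterable of adjacent unit indices.
    Not defined for a unit whose neighborhood covers every unit.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    z_scores = []
    for i in range(n):
        idx = set(neighbors[i]) | {i}          # the star: include unit i itself
        w_sum = len(idx)                       # sum of binary weights
        w_sq_sum = len(idx)                    # sum of squared binary weights
        numerator = sum(values[j] for j in idx) - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        z_scores.append(numerator / denominator)
    return z_scores


# Six hypothetical counties in a row: three high-rate, then three low-rate.
rates = [50.0, 55.0, 52.0, 20.0, 22.0, 21.0]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
z = getis_ord_gi_star(rates, adjacency)
```

A large positive z-score marks a unit surrounded by high values (a hot spot) and a large negative one marks a cold spot; in this toy layout, county 1 and county 4 fall on opposite sides of the conventional 1.96 threshold.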

Results: Melanoma incidence was 57.1% greater in a 15-county cluster (P < .05) in South Central PA; eight counties were designated as metropolitan. Compared with noncluster counties, cluster counties had significantly more cultivated land (mean 19.8% v 6.9%, P < .001) and herbicide-treated land (16.8% v 6.5%, P < .001). In adjusted models, a 10% increase in cultivated land and a 9% increase in herbicide-treated acreage each independently corresponded to a 14% increase in incidence.

Conclusion: Melanoma incidence clustered in South Central PA, an area with substantial agricultural industry. However, a majority of counties in the cluster were designated as metropolitan, challenging the concept that agriculture is primarily an industry of counties designated as nonmetropolitan (rural). Agricultural practices and patterns were associated with incidence, suggesting that cancer control adopt an integrated One Health approach to concurrently address occupational, environmental, and behavioral risks. The cluster was entirely within the 28-county catchment area of the Penn State Cancer Institute, demonstrating the utility of geospatial data and analysis for cancer control by a cancer center.

JCO Clinical Cancer Informatics, volume 9, e2500160. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629121/pdf/
Citations: 0
Artificial Intelligence System for Psychospiritual Distress in Family Caregivers of Patients With Terminal Cancer: A Retrospective Study.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-05 DOI: 10.1200/CCI-25-00129
Kento Masukawa, Ryusho Suzuki, Momoka Tanno, Masaharu Nakayama, Mitsunori Miyashita

Purpose: Family caregivers of patients with terminal cancer need psychospiritual care, but assessing their psychospiritual distress is challenging. An automated system could detect psychospiritual distress in the large volume of notes held in electronic medical records and help health care providers assess that distress accurately. This study aimed to develop an artificial intelligence system that automatically detects the psychological and spiritual distress of the families of patients with terminal cancer from unstructured text data in electronic medical records.

Methods: This retrospective study collected medical records (n = 1,554,736) from the 1 month before each participant's death. The participants (n = 808) died at Tohoku University Hospital in Japan between January 1, 2018, and December 31, 2019. We randomly selected 10,000 records from physician and nursing records and split the data set into training and testing sets at a ratio of 70:30. We used the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) to evaluate model performance. We used ELI5 (Explain Like I'm 5) to identify important expressions for detecting psychospiritual distress.
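
The evaluation setup described above (a 70:30 split scored with AUROC and AUPRC) can be sketched with the standard library alone. This is an illustration under stated assumptions: average precision is used as the estimator of AUPRC, which is a common choice but not one the abstract confirms, and the labels and scores are invented.

```python
import random


def split_70_30(n_records, seed=0):
    """Shuffle record indices and split them 70:30 into train and test."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = n_records * 7 // 10
    return indices[:cut], indices[cut:]


def auroc(labels, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


train_idx, test_idx = split_70_30(10_000)

# Hypothetical distress labels (1 = distress documented) and model scores.
labels = [1, 0, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]
roc = auroc(labels, scores)
ap = average_precision(labels, scores)
```

Because the chance level for AUPRC is roughly the prevalence of the positive class rather than 0.5, AUPRC is usually read alongside AUROC when positive records are rare.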

Results: The model with the highest performance for detecting psychological distress had AUROC and AUPRC values of 0.92 and 0.62, respectively. The model with the highest performance for detecting spiritual distress had AUROC and AUPRC values of 0.92 and 0.41, respectively. In psychological distress, the expressions with higher values were anxiety, worry, and tears. In spiritual distress, the expressions with higher values were want, me, and how.

Conclusion: This study showed the application of machine learning models for the detection of psychospiritual distress among family caregivers of patients with terminal cancer from electronic medical records.

JCO Clinical Cancer Informatics, volume 9, e2500129. Publication type: Journal Article.
Citations: 0
RadOncRAG: A Novel Retrieval-Augmented Generation Framework Improves Large Language Model Benchmark Performance in Radiation Oncology.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-14 DOI: 10.1200/CCI-25-00220
Nikhil Gautam Thaker, Navid Redjal, Adam Dicker, Arturo Loaiza-Bonilla, Trevor Royce, Vivek Subbiah, Vikash Deendyal, Jonathan R Gabriel, Neena Shetty, Ajay Choudhri, Gautam H Thaker

Purpose: Large language models (LLMs) show promise in assisting knowledge-intensive fields such as oncology, where up-to-date information and multidisciplinary expertise are critical. Traditional LLMs risk hallucinations and reliance on static, possibly outdated data that lack domain-specific context. Retrieval-augmented generation (RAG) has emerged as a strategy to address these issues by incorporating domain-specific information from external knowledge repositories.

Methods: We evaluated 15 LLMs, including Meta Llama-2/3, generative pretrained transformer (GPT)-3.5/4/4o variants, Claude-3, Gemini-2.0, and DeepSeek-R1. In a zero-shot workflow, each LLM answered 298 scorable questions from the 2021 American College of Radiology in-training examination. We implemented a RAG pipeline (Iridium Model) that transforms user prompts into vector embeddings, queries a specialized radiation oncology database, and merges relevant text with the original prompt to form an augmented query. We compared zero-shot versus RAG-augmented performance.
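
The three RAG steps described above (embed the prompt, query a knowledge store, merge the retrieved text with the prompt) can be sketched as follows. This is a toy illustration, not the Iridium Model: it substitutes a bag-of-words vector for a learned embedding and a three-sentence list for the specialized database, and every name in it is hypothetical.

```python
import math
import re
from collections import Counter

# Stand-in knowledge store; a real pipeline would query a vector database.
KNOWLEDGE_BASE = [
    "The gray (Gy) is the SI unit of absorbed radiation dose.",
    "Brachytherapy places sealed radioactive sources inside or next to the treatment area.",
    "A linear accelerator generates high-energy x-rays for external beam radiotherapy.",
]


def embed(text):
    """Toy embedding: word counts (a learned model would be used in practice)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]


def augment(query, passages, k=1):
    """Merge retrieved context with the original prompt."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


question = "What unit measures absorbed radiation dose?"
prompt = augment(question, KNOWLEDGE_BASE)
```

The augmented prompt is what would be sent to the LLM, so the model answers with the retrieved passage in view and the passage can be cited in the response.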

Results: Larger-parameter LLMs had higher zero-shot accuracy, with six models outscoring graduating residents (P < .01). Top scorers were reasoning models GPT-4o1, o3-mini, and DeepSeek-R1, which achieved 91.6%, 86.6%, and 91.6% without RAG, respectively. Gemini-2.0 improved 6.7% (to 79.2%), Llama-3-70b 8.4% (to 75.8%), and GPT-4o 5.7% (to 85.6%) with RAG. Top scoring reasoning models surpassed graduating resident averages by 17.7%-20% (P < .01), but had no improvement or detriment with RAG. Domain-specific gains occurred in the clinical, biology, and physics domains. Majority voting boosted aggregate accuracy when individual model performance exceeded 50%. RAG workflows and reasoning models incurred higher computational costs.
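
The 50% threshold for majority voting noted above has a simple probabilistic intuition, sketched below. The calculation assumes independent models and a right-or-wrong outcome per question; real models make correlated errors and the examination is multiple choice, so this only illustrates the direction of the effect. Function names are illustrative.

```python
from collections import Counter
from math import comb


def majority_vote(answers):
    """Most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def ensemble_accuracy(p, n_models):
    """Chance that a strict majority of n independent models is correct,
    when each model is correct with probability p."""
    need = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p ** k * (1 - p) ** (n_models - k)
        for k in range(need, n_models + 1)
    )


vote = majority_vote(["B", "A", "B", "C", "B"])
above = ensemble_accuracy(0.6, 5)   # each model better than chance
below = ensemble_accuracy(0.4, 5)   # each model worse than chance
```

With five voters, per-model accuracy of 0.6 rises at the ensemble level while 0.4 falls, which mirrors the reported pattern that voting helps only when individual accuracy exceeds 50%.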

Conclusion: A radiation-oncology-specific retrieval-augmented generation pipeline enhances nonreasoning LLM performance in radiation oncology by integrating domain-specific evidence, whereas it does not improve the performance of reasoning models. These findings demonstrate that RAG can elevate clinical decision support by enabling simpler, cost-effective nonreasoning models to tackle complex tasks through retrieval capabilities, an efficient alternative to extensive model training that also yields citable, evidence-based explanations.

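The observation that majority voting helps when individual accuracy exceeds 50% can be illustrated with a short calculation. The sketch below makes two simplifying assumptions that the study does not state: voters err independently, and each question has only two options. Real examination items are multiple choice and model errors are likely correlated, so the numbers are illustrative only. The function names are hypothetical.

```python
from collections import Counter
from math import comb

def majority_vote(answers):
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]

def majority_accuracy(p, n):
    """Probability that a strict majority of n independent voters is correct,
    when each voter is correct with probability p (n should be odd)."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))

ensemble_answer = majority_vote(["B", "A", "B"])
above_half = majority_accuracy(0.6, 3)   # voters better than chance
below_half = majority_accuracy(0.4, 3)   # voters worse than chance
```

Under these assumptions, three voters at 60% accuracy reach 3(0.6²)(0.4) + 0.6³ = 0.648 as an ensemble, while three voters at 40% fall to 0.352, which is the pattern the abstract describes.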
Citations: 0
Comparative Evaluation of Explainable Machine Learning Versus Linear Regression for Predicting County-Level Lung Cancer Mortality Rate in the United States.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-17 DOI: 10.1200/CCI-24-00310
Soheil Hashtarkhani, Brianna M White, Benyamin Hoseini, David L Schwartz, Arash Shaban-Nejad

Purpose: Lung cancer (LC) is a leading cause of cancer-related mortality in the United States. Accurate prediction of LC mortality rates is crucial for guiding targeted interventions and addressing health disparities. Although traditional regression-based models have been commonly used, explainable machine learning models may offer enhanced predictive accuracy and deeper insights into the factors influencing LC mortality.

Methods: This study applied three models (random forest [RF], gradient boosting regression [GBR], and linear regression [LR]) to predict county-level LC mortality rates across the United States. Model performance was evaluated using R-squared and root mean squared error (RMSE). Shapley Additive Explanations (SHAP) values were used to determine variable importance and their directional impact. Geographic disparities in LC mortality were analyzed through Getis-Ord (Gi*) hotspot analysis.
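The two evaluation metrics named above can be computed directly from observed and predicted values. The sketch below uses the standard definitions; the `observed` and `predicted` lists are hypothetical numbers chosen for easy arithmetic, not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observed and predicted county mortality rates.
observed = [40.0, 50.0, 60.0, 70.0]
predicted = [45.0, 50.0, 55.0, 70.0]

r2 = r_squared(observed, predicted)
error = rmse(observed, predicted)
```

Here the residual sum of squares is 50 and the total sum of squares is 500, so R-squared is 0.9 and RMSE is the square root of 12.5. An R-squared reported as a percentage corresponds to this fraction multiplied by 100.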

Results: The RF model outperformed both GBR and LR, achieving an R² value of 41.9% and an RMSE of 12.8. SHAP analysis identified smoking rate as the most important predictor, followed by median home value and the percentage of the Hispanic ethnic population. Spatial analysis revealed significant clusters of elevated LC mortality in the mid-eastern counties of the United States.
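For readers unfamiliar with hotspot analysis, the Getis-Ord Gi* statistic is a z-score that compares the weighted sum of values around a location with what the global mean would predict. The sketch below implements the commonly published formula; the five-county line of `rates` and the binary neighbor `weights` are hypothetical, and the study's actual spatial weights are not given in the abstract.

```python
import math

def getis_ord_gi_star(values, weights, i):
    """Gi* z-score for location i.

    weights[i][j] is the spatial weight between i and j and, for the
    starred variant, includes the self-weight weights[i][i].
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    w = weights[i]
    sum_w = sum(w)
    sum_w2 = sum(x * x for x in w)
    numerator = sum(wj * vj for wj, vj in zip(w, values)) - mean * sum_w
    denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical mortality rates for five counties arranged in a line.
rates = [10.0, 10.0, 10.0, 50.0, 60.0]
# Binary weights: each end county is linked to itself and its one neighbor.
weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]

z_high_end = getis_ord_gi_star(rates, weights, 4)
z_low_end = getis_ord_gi_star(rates, weights, 0)
```

A positive score, as at the high-rate end of the line, suggests a cluster of elevated values; a negative score suggests a cluster of low values. Significance thresholds and multiple-testing corrections are left out of this sketch.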

Conclusion: The RF model demonstrated superior predictive performance for LC mortality rates and highlighted the roles of smoking prevalence, housing values, and the percentage of the Hispanic ethnic population. These findings offer actionable insights for designing targeted interventions, promoting screening, and addressing health disparities in the regions of the United States most affected by LC.

Citations: 0
Reimagining Evidence: Artificial Intelligence Synthetic Data Generation for Cancer Research.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-14 DOI: 10.1200/CCI-25-00304
Guergana Savova, Shan Chen, Jiarui Yao, Danielle Bitterman
Citations: 0
Informatics Perspectives on the National Cancer Policy Forum Workshop "Enabling 21st Century Applications for Cancer Surveillance Through Enhanced Registries and Beyond".
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-19 DOI: 10.1200/CCI-25-00098
Peter P Yu, W Scott Campbell, Eric B Durbin, Lawrence N Shulman, Jeremy L Warner

The National Cancer Policy Forum workshop "Enabling 21st Century Applications for Cancer Surveillance Through Enhanced Registries and Beyond" examined the current state of cancer registries and how they might evolve to extend registry missions to national health priorities: improving patient and health economic outcomes, equitable access to care, quality of health care, and health system operational efficiency. Session 3 of the workshop focused on medical informatics as a driver of improvement in cancer registry data quality and interoperability. Data quality begins with precision in data definitions as codified in controlled vocabularies and ontologies. Oncology data dictionaries that have been established or are evolving are described. Harmonization of various data dictionaries through representation in Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) and hierarchical classification systems within Common Data Models is outlined. Interoperability requires transmission standards that facilitate exchange of data between data sources, registries, and data consumers. While highly structured data capture and representation support semantically appropriate data use, the high degree of effort related to data capture and the accompanying rigidity in the data structure are challenges to implementation. Artificial intelligence may provide alternative paths for the extraction and representation of cancer registry data. Higher-fidelity cancer data and greater interoperability of data combined with data governance will help realize a Learning Health System for oncology, but economic benefits need to be shared to support the infrastructure costs incurred by health care systems.

Citations: 0
Multimodal Artificial Intelligence Model From Baseline Histopathology Adds Prognostic Information for Distant Recurrence Assessment in Hormone Receptor-Positive/Human Epidermal Growth Factor Receptor 2-Negative Early Breast Cancer.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-21 DOI: 10.1200/CCI-24-00287
Daniel Kates-Harbeck, Hans Kreipe, Oleg Gluz, Matthias Christgen, Sherko Kuemmel, Monika Graeser, Ulrike Nitz, Sven Mahner, Doris Mayr, Rachel Wuerstlein, Akinori Mitani, Jingbin Zhang, Hans Pinckaers, Gijs Smit, Yi Ren, Songwan Joun, Jacqueline Griffin, Nancy Lin, Felix Feng, Andre Esteva, Ronald Kates, Nadia Harbeck

Purpose: Prognostic assessment in hormone receptor-positive (HR+)/human epidermal growth factor receptor 2-negative (HER2-) early breast cancer (EBC) remains challenging, given relatively low rates of disease progression. Modern artificial intelligence (AI)-based techniques have provided advanced prognostic tools in cancer.

Patients and methods: The Artera multimodal AI (MMAI) platform, using digital histopathology and clinical data, was applied to develop and test a prognostic risk assessment algorithm in HR+/HER2- EBC. Hematoxylin and eosin (H&E) slides from pretreatment breast biopsy and surgical specimens were digitized from the WSG PlanB and ADAPT trials. Patients with available images and complete data (n = 5,259) were stratified by trial, treatment, and distant metastasis (DM) into training (development: 60%) and internal validation (holdout: 40%) cohorts. The algorithm provided prognostic DM risk scores on the basis of image data and clinical variables (age, T and N stages, and tumor size). Univariable and multivariable Fine-Gray models were used to assess performance on the test cohort; subdistribution hazard ratios (sHR) are reported per standard deviation increase of the model scores. Prespecified prognostic subgroups for analysis were defined by nodal status, menopausal status, and tumor grade.
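A stratified 60/40 split of the kind described above can be sketched as follows. This is an illustration under assumptions, not the study's procedure: the record layout, the treatment labels, and the `stratified_split` helper are hypothetical, and only the trial names and the 60/40 proportions come from the text.

```python
import random
from collections import defaultdict

def stratified_split(records, key, train_fraction=0.6, seed=0):
    """Split records into (train, holdout), preserving each stratum's share."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    train, holdout = [], []
    for group in strata.values():
        rng.shuffle(group)  # randomize within the stratum only
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        holdout.extend(group[cut:])
    return train, holdout

# Hypothetical patients: (id, trial, treatment, distant_metastasis).
patients = [
    (i,
     "PlanB" if i % 2 else "ADAPT",
     "chemo" if i % 4 < 2 else "endocrine",
     i % 10 == 0)
    for i in range(200)
]

# Stratify on the combination of trial, treatment, and DM status.
train, holdout = stratified_split(patients, key=lambda p: p[1:])
```

Because the cut is made inside each stratum, the rare distant-metastasis events are divided in the same 60/40 ratio as the full cohort, which keeps event rates comparable between development and holdout sets.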

Results: The trained MMAI score was significantly associated with risk of DM in the test cohort (sHR, 2.3 [95% CI, 2.0 to 2.8]) as a whole and across subgroups. The score remained significant (sHR, 2.2 [95% CI, 1.7 to 2.8]) after adjusting for clinical prognostic factors. The MMAI image component alone had significant prognostic value (sHR, 1.6 [95% CI, 1.3 to 1.9]) in the test cohort; it also had significant prognostic value separately within the G2 and G3 subgroups, with sHR of 1.5 per standard deviation increase, and in most of the other predefined clinical subgroups.
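Because the subdistribution hazard ratios are reported per standard deviation of the score, a brief calculation may help with interpretation. The sketch assumes the usual log-linear form of the model, so that a change of k standard deviations multiplies the hazard by sHR to the power k, and it assumes the reported interval is symmetric on the log scale. Both are standard modeling assumptions rather than statements from the abstract, and the function names are hypothetical.

```python
import math

def hr_for_sd_change(hr_per_sd, k):
    """Hazard ratio implied by a change of k standard deviations in the score."""
    return hr_per_sd ** k

def log_se_from_ci(lower, upper, z=1.96):
    """Approximate standard error of log(HR) recovered from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported overall estimate: sHR 2.3 (95% CI, 2.0 to 2.8).
two_sd_ratio = hr_for_sd_change(2.3, 2)
log_se = log_se_from_ci(2.0, 2.8)
```

Under these assumptions, a patient scoring two standard deviations higher has roughly 2.3 × 2.3 ≈ 5.3 times the subdistribution hazard of distant metastasis, and the standard error of the log hazard ratio is about 0.086.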

Conclusion: MMAI using digital pathology from H&E slides provides enhanced prognostic quality in HR+/HER2- EBC and could help to advance personalized breast cancer management.

Citations: 0