JCO Clinical Cancer Informatics: Latest Articles

SmokeBERT and Beyond: Bridging Clinical Narratives and Structured Smoking Data to Improve Lung Cancer Screening.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 Epub Date: 2025-12-22 DOI: 10.1200/CCI-25-00350
Heng Tan, Travis J Osterman
JCO Clinical Cancer Informatics, volume 9, e2500350. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12782282/pdf/
Citations: 0
Critical Role of Model Selection in Evaluating AI Performance for Tumor Board Decision Making.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 Epub Date: 2025-12-11 DOI: 10.1200/CCI-25-00189
Mehmet Mutlu Çatlı, Arif Hakan Önder
JCO Clinical Cancer Informatics, volume 9, e2500189. Journal Article. https://doi.org/10.1200/CCI-25-00189
Citations: 0
Dynamic Mortality Risk Prediction in Myelodysplastic Syndromes Using Longitudinal Clinical Data.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 Epub Date: 2025-12-23 DOI: 10.1200/CCI-25-00236
Jonathan Bobak, Philipp Spohr, Sarah Richter, Alexander Streuer, Felicitas Isabel Schulz, Corinna Strupp, Catharina Gerhards, Nanni Schmitt, Thomas Luft, Sascha Dietrich, Ulrich Germing, Gunnar W Klau

Purpose: Patients with myelodysplastic syndromes (MDS) exhibit diverse disease trajectories necessitating different clinical approaches ranging from watch-and-wait strategies to hematopoietic stem cell transplantation. Existing risk scores like the IPSS-R or Endothelial Activation and Stress Index provide static risk stratification at diagnosis but do not capture evolving disease dynamics. We addressed this problem by introducing a dynamic, data-driven approach to repeatedly predict short-term mortality risks, across the patient's disease course.

Materials and methods: We developed a machine learning model on the basis of gradient-boosted decision trees to estimate 1-year mortality risks from both longitudinal parameters from blood values and diagnosis-based features. We trained the model on a data set of patients from the MDS Registry Düsseldorf (n = 1,024) and validated it on patients from University Hospitals Heidelberg (n = 286) and Mannheim (n = 31).

Results: Validations on independent cohorts achieved area under the receiver operating characteristic curve scores of around 0.8 and better predictive performance for 1-year mortality compared with a diagnosis-only baseline model. The model accurately predicted mortality risks as early as within the first 90 days of diagnosis. Feature importance analysis revealed clinically plausible feature-label relations, supporting interpretability. Comparison with the IPSS-R and training on 1-year AML progression revealed the advantages and generalizability of the approach.

Conclusion: This dynamic risk model enables continuous, individualized assessment of 1-year mortality risk in patients with MDS, offering a supplement to static scores used at diagnosis. Our results highlight the utility and importance of including longitudinal parameters in risk assessment analysis.
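
The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen patient who died within 1 year received a higher predicted risk than a randomly chosen patient who survived. The following is a minimal sketch of that rank-based calculation, not the authors' evaluation code; the function name `auroc` and the toy labels and scores are illustrative assumptions.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney formulation.

    labels: 1 = event (e.g., death within 1 year), 0 = no event.
    scores: predicted risks, higher = more likely to have the event.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two events, three non-events.
labels = [1, 0, 1, 0, 0]
scores = [0.90, 0.40, 0.35, 0.20, 0.80]
print(round(auroc(labels, scores), 3))  # 4 of 6 positive-negative pairs are ranked correctly
```

The pairwise loop is quadratic in cohort size, which is fine for illustration; a sort-based rank sum gives the same value for larger cohorts.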

JCO Clinical Cancer Informatics, volume 9, e2500236. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12727070/pdf/
Citations: 0
Building Pediatric Cancer Cohorts and Accessing Data Using Childhood Cancer Data Initiative Tools.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 Epub Date: 2025-12-10 DOI: 10.1200/CCI-25-00217
Subhashini Jagu, Jaime M Guidry Auvil, Mark Cunningham, Bahar Sayoldin, Patrick Dunn, Sean Burke, Catherine Bullen, Qiong Liu, Ricardo Flores Jimenez, Cynthia Winter, Hayley Dingerdissen, Janisha Patel, John Otridge, Brigitte Widemann, Warren Kibbe, Gregory Reaman

Purpose: Data sharing is necessary to advance understanding of the etiology and biology of cancer in children, adolescents, and young adults; drive therapeutic discoveries; and improve treatment outcomes. To meet this critical need, the National Cancer Institute's Childhood Cancer Data Initiative (CCDI) provides innovative, user-friendly tools and resources that enable researchers and pediatric oncologists to access and analyze the large volume of diverse childhood cancer data (over 1 million files) that has been collected and harmonized from multiple studies, including CCDI's Molecular Characterization Initiative, Pediatric MATCH, Childhood Cancer Survivor Study, etc. This article outlines how to find, request, access, download, and analyze data indexed in the CCDI Hub Explore Dashboard and Childhood Cancer Clinical Data Commons (C3DC), key components of the CCDI Data Ecosystem, to accelerate progress in pediatric cancer research.

Methods: Both CCDI resources support cohort-based analysis and use data models that include study, participant, sample, diagnosis, and treatment data. These models are updated in collaboration with field experts. Additionally, CCDI drafted a Pediatric Cancer Core common data elements list, which serves as a standard reference for researchers.

Results: The CCDI Hub is the primary access point for finding data, tools, and applications managed by CCDI. The C3DC provides harmonized, participant-level clinical data and the CCDI Hub Explore Dashboard catalogs data at the file level. These resources enable users to search for and download manifests of harmonized, de-identified participant data and build cohorts.

Conclusion: CCDI prioritizes data accessibility and interoperability and, with its resources and data, continues to aid in pediatric cancer research discovery, data-driven insights, and collaboration across the pediatric cancer community.

JCO Clinical Cancer Informatics, volume 9, e2500217. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12724070/pdf/
Citations: 0
Automated Identification of Radiotherapy Courses From US Department of Veterans Affairs Administrative Data.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 Epub Date: 2025-12-08 DOI: 10.1200/CCI-25-00088
William Schreyer, Ryan Melson, Christopher Anderson, Cecelia Madison, Evangelia Katsoulakis, Reid F Thompson

Purpose: Radiotherapy is a critically important cancer treatment; however, its details are often not well represented in electronic health record data sets. US Veterans' radiation courses are further distributed across a range of medical centers, both internal and external to the Veterans Health Administration (VHA), inhibiting analysis of radiotherapy treatment across this population.

Methods: We train and test a suite of supervised machine learning models for the accurate prediction of radiation course dates using billing and diagnostic codes from a combination of VHA and Centers for Medicare and Medicaid Services (CMS) databases. We use a separate heuristic algorithm to assemble course date predictions into complete radiation treatments.

Results: Our top model predicts radiation course dates with compelling accuracy (macro-average of 0.914 across classes). The retrospective application of our model and assembly algorithm to radiation procedure dates for 1,331,342 patients identified 1,526,660 predicted courses of radiotherapy.

Conclusion: The identified courses were collected into a shared resource to facilitate future VHA-based studies, and our predictive model is available for application to a wider range of non-VHA data sets, particularly those leveraging CMS data.
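
The abstract does not describe the heuristic used to assemble predicted treatment dates into courses. One plausible approach, shown here purely as an assumption, is to sort a patient's predicted radiation dates and start a new course whenever the gap to the previous date exceeds a threshold. The function name `assemble_courses` and the 14-day default are hypothetical and are not taken from the paper.

```python
from datetime import date


def assemble_courses(treatment_dates, max_gap_days=14):
    """Group predicted radiation dates into courses.

    A new course starts whenever the gap since the previous date exceeds
    max_gap_days (a hypothetical threshold, not the paper's value).
    Returns a list of (start_date, end_date, n_treatment_days) tuples.
    """
    courses = []
    for d in sorted(set(treatment_dates)):
        if courses and (d - courses[-1][-1]).days <= max_gap_days:
            courses[-1].append(d)  # close enough: same course
        else:
            courses.append([d])    # gap too large: open a new course
    return [(c[0], c[-1], len(c)) for c in courses]


# Toy example (hypothetical dates for one patient).
dates = [
    date(2020, 1, 6), date(2020, 1, 7), date(2020, 1, 8),
    date(2020, 3, 2), date(2020, 3, 3),
]
for start, end, n in assemble_courses(dates):
    print(start, end, n)
```

A gap-based rule is easy to audit, but the right threshold depends on how treatment breaks and re-irradiation should be counted, which the abstract leaves unspecified.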

JCO Clinical Cancer Informatics, volume 9, e2500088. Journal Article. https://doi.org/10.1200/CCI-25-00088
Citations: 0
Reimagining Cancer Care With Generative Artificial Intelligence: The Promise of Large Language Models.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 DOI: 10.1200/CCI-25-00134
Ji-Eun Irene Yum, Syed Arsalan Ahmed Naqvi, Ben Zhou, Irbaz Bin Riaz

The emergence of state-of-the-art large language models (LLMs), which hold the ability to generalize to diverse natural language processing tasks, has led to new opportunities in health care. Oncology is especially well-suited to leverage these resources as the journeys of patients with cancer inherently yield extensive, longitudinal data sets comprising clinical narratives, pathology and radiology reports, and genomic sequencing reports. This review begins with an overview of the fundamental concepts behind LLMs, including the definitions, architecture, training paradigm, and performance optimization through prompt engineering and retrieval-augmented generation. We also take a moment to explore the newly emerging paradigm of LLMs in a multiagentic framework. We then synthesize current research on how LLMs may benefit stakeholders within the practice of oncology, including patients, oncologists, researchers, and learners. Finally, we address the limitations and risks of LLMs, including hallucinations, inherent biases, patient privacy, and clinician deskilling. While research thus far shows significant potential for LLMs to transform cancer care, necessary future directions include studies emphasizing patient stakeholder perspectives on LLM incorporation in clinical workflows, the development of relevant clinical benchmarks for LLM evaluation, a greater focus on real-world prospective testing, and deeper exploration of LLM reasoning capabilities.

JCO Clinical Cancer Informatics, volume 9, e2500134. Journal Article. https://doi.org/10.1200/CCI-25-00134
Citations: 0
Geospatial Analysis of Commission on Cancer-Accredited Centers Within Cancer Care Utilization-Based Catchment Areas.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-12-01 Epub Date: 2025-12-23 DOI: 10.1200/CCI-25-00163
Nicole Rademacher, Connor Sisk, Joshua S Richman, Kristy Broman, Changzhen Wang

Purpose: The Commission on Cancer (CoC) seeks to expand access to high-quality care through community engagement standards targeting centers' catchment areas and efforts to accredit centers in more areas including rural hospitals. Little is known about the social, environmental, and geographic characteristics of their catchment areas. To support future investigation into the impact of CoC-accredited centers, this study compares characteristics of cancer care utilization-based catchment areas, termed Cancer Service Areas (CSAs), with and without CoC-accredited centers.

Methods: Geocoded CoC-accredited centers and cancer care patient flows extracted from Medicare claims data were used to delineate CSAs using a spatially constrained community detection method. Characteristics including environmental justice index (EJI), social vulnerability index (SVI), rurality, travel time, and localization index (LI, a ratio of cancer care received by patients within a CSA) were aggregated by CSA. A logistic regression model was created to evaluate characteristics associated with the presence of a CoC-accredited center within a CSA.

Results: Six hundred sixty-eight CSAs were defined, of which 511 CSAs had at least one CoC-accredited center. CSAs with CoC-accredited centers had lower health vulnerability (odds ratio [OR], 0.65 [95% CI, 0.427 to 0.993]) and lower racial and ethnic minority status vulnerability (OR, 0.61 [95% CI, 0.424 to 0.886]), but no differences for other components of the EJI or SVI. These CSAs also had higher LIs, meaning patients remained in their local CSA for care (OR, 9.00 [95% CI, 2.408 to 33.640] for high v low LIs).

Conclusion: Minority and comorbid populations may have more difficulty accessing cancer center care, further exacerbating observed variations in cancer outcomes. Cancer centers may address this by broadening their outreach into at-risk catchment areas.
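
The localization index (LI) mentioned in the Methods can be illustrated with a small calculation. The sketch below assumes LI is the share of care used by a CSA's residents that is delivered inside that same CSA; the abstract gives only a brief definition, so this reading, the function name `localization_index`, and the toy flows are assumptions.

```python
from collections import defaultdict


def localization_index(flows):
    """Compute a localization index per CSA.

    flows: iterable of (patient_csa, provider_csa, n_visits) tuples.
    LI for a CSA = visits by its residents that occur inside that CSA,
    divided by all visits by its residents (assumed definition).
    """
    total = defaultdict(float)
    local = defaultdict(float)
    for home, site, n_visits in flows:
        total[home] += n_visits
        if home == site:
            local[home] += n_visits
    return {csa: local[csa] / total[csa] for csa in total if total[csa] > 0}


# Toy patient flows between two hypothetical CSAs, "A" and "B".
flows = [
    ("A", "A", 80), ("A", "B", 20),
    ("B", "B", 30), ("B", "A", 70),
]
print(localization_index(flows))  # residents of A mostly stay local; residents of B mostly travel
```

Under this reading, a high LI means most residents receive care locally, which matches the abstract's interpretation that patients in high-LI areas remained in their local CSA for care.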

JCO Clinical Cancer Informatics, volume 9, e2500163. Journal Article. https://doi.org/10.1200/CCI-25-00163
Citations: 0
Development, External Validation, and Deployment of RFAN-ML: A Machine Learning Model to Estimate Renal Function After Nephrectomy.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-07 DOI: 10.1200/CCI-25-00086
Jesse Persily, Steven L Chang, Chen Chen, Yassamin Neshatvar, Siri Desiraju, Rajesh Ranganath, Katie Murray, Adam Feldman, Douglas Dahl, Samir S Taneja, William C Huang, Madhur Nayan

Purpose: Partial nephrectomy has been advocated as the preferred surgical approach for small kidney tumors over total nephrectomy. However, partial nephrectomy is associated with increased perioperative risk. Estimating renal function after nephrectomy can facilitate personalized patient counseling, guide surgical approach, and identify patients who could benefit from perioperative interventions. Existing prediction models have several limitations including the lack of external validation or a user-friendly tool or application, and most have used traditional statistical methods.

Methods: We used data from two academic medical institutions and machine learning (ML) methods to develop and externally validate renal function after nephrectomy-machine learning (RFAN-ML), a model to estimate long-term renal function after partial or total nephrectomy. Boruta feature selection was used to select four routinely available clinical features, specifically age, BMI, preoperative renal function, and nephrectomy type. In the training set of 1,932 patients, we compared six ML regression models representing a set of both ensemble and nonensemble ML algorithms and optimized for root mean squared error (RMSE). These models were evaluated in a test set of 1,995 patients, and the best-performing model was selected as RFAN-ML.

Results: We compared RFAN-ML with existing renal function prediction benchmarks and found that RFAN-ML outperformed or had competitive performance with benchmarks on RMSE (16.6 [95% CI, 15.6 to 17.5]), R2, and mean absolute error.

Conclusion: We developed and externally validated RFAN-ML, a ML model to predict renal function after nephrectomy, and have deployed our model online. RFAN-ML has the potential to improve the care and outcomes in patients with kidney tumors by informing personalized patient counseling and guiding surgical planning.
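
The three metrics named in the Results (RMSE, R2, and mean absolute error) have standard definitions that can be computed directly from observed and predicted values. The sketch below uses toy numbers and a hypothetical function name, `regression_metrics`; it is not the authors' evaluation code and does not reproduce the reported results.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for paired observed/predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # R^2 is undefined when all observed values are identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Toy postoperative renal function values (hypothetical numbers).
observed = [60.0, 80.0, 100.0]
predicted = [70.0, 80.0, 90.0]
print(regression_metrics(observed, predicted))
```

RMSE penalizes large errors more heavily than mean absolute error, so reporting both alongside R2 gives a fuller picture of how a model's predictions deviate from observed values.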

JCO Clinical Cancer Informatics, volume 9, e2500086. Journal Article. https://doi.org/10.1200/CCI-25-00086
Citations: 0
Harvesting Risk: An Ecologic Study of Agricultural Practices and Patterns and Melanoma Incidence in Pennsylvania.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-14 DOI: 10.1200/CCI-25-00160
Benjamin J Marks, Jiangang Liao, Charlene Lam, Camille Moeckel, Eugene J Lengerich

Purpose: To examine the geospatial distribution of melanoma incidence in Pennsylvania (PA), quantify its association with agricultural practices and patterns, and consider its relevance for cancer control.

Methods: The study used an ecologic design with county-level PA data on the 2017-2021 incidence of invasive melanoma among adults 50 years and older, as well as agricultural patterns and practices, ultraviolet radiation (UVR), and demographics/socioeconomics. Spatial clustering was examined using local indicators of spatial association and Getis-Ord Gi*. Separate adjacency-weighted Conway-Maxwell-Poisson models, adjusted for UVR and social vulnerability, quantified the association between melanoma and (1) cultivated and pasture/hay acreage and (2) herbicide-, insecticide-, fungicide-, and manure-treated acreage.
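For readers unfamiliar with Getis-Ord Gi*, the sketch below computes the standard Gi* z-score from a list of regional values and a spatial weights matrix in which each region counts as its own neighbor. It is an illustration only, not the study's code: the function name, the binary adjacency weights, and the toy values in the usage note are assumptions made here.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every region.

    values  -- list of n regional values (e.g. incidence rates)
    weights -- n x n matrix; weights[i][j] > 0 if region j is a neighbor of
               region i, with weights[i][i] > 0 (Gi* includes the region itself)
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation of the values (population form).
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)

    scores = []
    for i in range(n):
        row = weights[i]
        sum_w = sum(row)
        sum_w2 = sum(w * w for w in row)
        # Observed weighted local sum minus its expectation under no clustering.
        numerator = sum(w * x for w, x in zip(row, values)) - mean * sum_w
        # Standard error of the local sum; zero if a region neighbors every
        # region with equal weight, in which case Gi* is undefined.
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores
```

With five regions in a row, values `[10, 10, 10, 50, 60]`, and each region weighted 1 for itself and its immediate neighbors, the last region scores about +1.98 and the first about -1.32. Large positive scores mark candidate hot spots and large negative scores mark cold spots.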

Results: Melanoma incidence was 57.1% greater in a 15-county cluster (P < .05) in South Central PA; eight counties were designated as metropolitan. Compared with noncluster counties, cluster counties had significantly more cultivated land (mean 19.8% v 6.9%, P < .001) and herbicide-treated land (16.8% v 6.5%, P < .001). In adjusted models, a 10% increase in cultivated land and a 9% increase in herbicide-treated acreage each independently corresponded to a 14% increase in incidence.

Conclusion: Melanoma incidence clustered in South Central PA, an area with substantial agricultural industry. However, a majority of counties in the cluster were designated as metropolitan, challenging the concept that agriculture is primarily an industry of counties designated as nonmetropolitan (rural). Agricultural practices and patterns were associated with incidence, suggesting that cancer control adopt an integrated One Health approach to concurrently address occupational, environmental, and behavioral risks. The cluster was entirely within the 28-county catchment area of the Penn State Cancer Institute, demonstrating the utility of geospatial data and analysis for cancer control by a cancer center.

Citations: 0
RadOncRAG: A Novel Retrieval-Augmented Generation Framework Improves Large Language Model Benchmark Performance in Radiation Oncology.
IF 2.8 Q2 ONCOLOGY Pub Date : 2025-11-01 Epub Date: 2025-11-14 DOI: 10.1200/CCI-25-00220
Nikhil Gautam Thaker, Navid Redjal, Adam Dicker, Arturo Loaiza-Bonilla, Trevor Royce, Vivek Subbiah, Vikash Deendyal, Jonathan R Gabriel, Neena Shetty, Ajay Choudhri, Gautam H Thaker

Purpose: Large language models (LLMs) show promise in assisting knowledge-intensive fields such as oncology, where up-to-date information and multidisciplinary expertise are critical. Traditional LLMs risk hallucinations and reliance on static, possibly outdated data that lack domain-specific context. Retrieval-augmented generation (RAG) has emerged as a strategy to address these issues by incorporating domain-specific information from external knowledge repositories.

Methods: We evaluated 15 LLMs, including Meta Llama-2/3, generative pretrained transformer (GPT)-3.5/4/4o variants, Claude-3, Gemini-2.0, and DeepSeek-R1. In a zero-shot workflow, each LLM answered 298 scorable questions from the 2021 American College of Radiology in-training examination. We implemented a RAG pipeline (Iridium Model) that transforms user prompts into vector embeddings, queries a specialized radiation oncology database, and merges relevant text with the original prompt to form an augmented query. We compared zero-shot versus RAG-augmented performance.
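The retrieve-then-augment flow described above can be sketched as follows. The abstract does not give the internals of the Iridium Model, so this is a toy stand-in: the bag-of-words `embed` function replaces a learned embedding model, an in-memory list replaces the radiation oncology database, and all function names and passages are hypothetical placeholders.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def augment(query, passages, k=2):
    """Merge the retrieved passages with the original prompt."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Calling `augment(question, passages)` returns a single string that would then be sent to the LLM in place of the bare question. In a zero-shot run the model would receive only the question.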

Results: Larger-parameter LLMs had higher zero-shot accuracy, with six models outscoring graduating residents (P < .01). Top scorers were reasoning models GPT-4o1, o3-mini, and DeepSeek-R1, which achieved 91.6%, 86.6%, and 91.6% without RAG, respectively. Gemini-2.0 improved 6.7% (to 79.2%), Llama-3-70b 8.4% (to 75.8%), and GPT-4o 5.7% (to 85.6%) with RAG. Top scoring reasoning models surpassed graduating resident averages by 17.7%-20% (P < .01), but had no improvement or detriment with RAG. Domain-specific gains occurred in clinical, biology, and physics. Majority voting boosted aggregate accuracy when individual model performance exceeded 50%. RAG workflows and reasoning models incurred higher computational costs.
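One way to see why majority voting helps only above 50% individual accuracy is the binomial calculation below. It is a simplified sketch under assumptions the abstract does not state: every model is correct with the same probability `p`, the models err independently, and each answer is scored as simply right or wrong. The function name `majority_accuracy` is illustrative.

```python
from math import comb


def majority_accuracy(p, n):
    """Probability that a strict majority of n independent voters is correct.

    p -- probability that any single voter is correct
    n -- odd number of voters (odd so that ties cannot occur)
    """
    if n % 2 == 0:
        raise ValueError("n must be odd to avoid ties")
    needed = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )
```

Under these assumptions, three voters at 60% individual accuracy reach 64.8% as a majority, while three voters at 40% fall to 35.2%. Real models answering multiple-choice questions tend to make correlated errors, so actual gains may be smaller.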

Conclusion: A radiation-oncology-specific retrieval-augmented generation pipeline enhances nonreasoning LLM performance in radiation oncology by integrating domain-specific evidence, whereas it does not improve the performance of reasoning models. These findings demonstrate that RAG can elevate clinical decision support by enabling simpler, cost-effective nonreasoning models to tackle complex tasks through retrieval capabilities, an efficient alternative to extensive model training that also yields citable, evidence-based explanations.

Citations: 0