JCO Clinical Cancer Informatics: Latest Articles, Page 8

Measuring the Association Between the COVID-19 Pandemic and Cancer Incidence by Sex Using a Quasi-Experimental Study Design.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-11-01 Epub Date: 2025-10-30 DOI: 10.1200/CCI-24-00327

Kathleen M Decker, Allison Feely, Iresha Ratnayake, Oliver Bucher, Piotr Czaykowski, Katie Galloway, Pamela Hebbard, Julian O Kim, Grace Musto, Marshall Pitz, Harminder Singh, Pascal Lambert

Purpose: This study examined the association between COVID-19 and cancer incidence by sex in Manitoba, Canada.

Methods: We used a population-based quasi-experimental study design and an interrupted time-series analysis to compare the rate of new cancer diagnoses between males and females before (January 2015 until December 2019) and after the start of the COVID-19 pandemic (April 2020 until December 2022).
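
The before-and-after comparison described above can be illustrated with a deliberately simplified sketch: fit a straight-line trend to the pre-pandemic periods, project it over the post-pandemic periods, and compare observed with expected totals. This is not the authors' model (the abstract does not give its specification). The function names, the plain least-squares trend, and the toy counts in the usage note are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def cumulative_difference(pre, post, gap=0):
    """Project the pre-period trend over the post period and compare totals.

    pre, post: per-period counts (or rates), in time order.
    gap: number of periods skipped between the two series
         (hypothetical parameter, e.g. an excluded transition window).
    Returns (expected_total, observed_total, percent_difference), where a
    negative percentage is a deficit and a positive one a surplus.
    """
    xs = list(range(len(pre)))
    intercept, slope = fit_line(xs, pre)
    start = len(pre) + gap
    expected = [intercept + slope * (start + i) for i in range(len(post))]
    expected_total = sum(expected)
    observed_total = sum(post)
    pct = 100.0 * (observed_total - expected_total) / expected_total
    return expected_total, observed_total, pct
```

Calling `cumulative_difference([100, 102, 104, 106], [90, 92], gap=1)` on these made-up counts projects the pre-period trend forward and returns a negative percentage, that is, a deficit relative to the projected trend. A real interrupted time-series analysis would also model seasonality, autocorrelation, and age standardization, none of which is attempted here.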

Results: A total of 16,200 females and 20,631 males diagnosed with cancer between 2015 and 2022 in Manitoba were included. Colon cancer incidence decreased by 34% for males and females from April to September 2020. Incidence then remained stable for males but decreased by 22% from October 2021 to December 2022 for females. Brain and CNS cancer incidence decreased by 37% for males during 2021 and 2022, whereas for females it decreased (by 77%) only during the last quarter of 2020 and the first quarter of 2021. Urinary cancer incidence decreased by 18% for males from April 2020 to December 2022 but was stable for females. Head and neck cancer incidence decreased by 22% for males during 2020 but was stable for females. As of December 2022, the largest estimated cumulative differences in the number of cases occurred for males diagnosed with brain and CNS cancer (31.6% deficit, 76 cases), urinary cancer (18.4% deficit, 186 cases), and endocrine cancer (52.4% surplus, 56 cases), and for females diagnosed with colon cancer (19.7% deficit, 187 cases).

Conclusion: Sex-based differences in the association between age-standardized cancer incidence and the COVID-19 pandemic exist for several cancer sites. Sex-based differences in postpandemic cancer incidence, especially for brain and CNS, urinary, and colon cancers, need follow-up because of the ongoing deficits documented in this study.

Volume 9, e2400327. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12591556/pdf/

Citations: 0

Acute Care Utilization Patterns During Chemotherapy and Predictive Model Development at a Rural Community Cancer Center.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-11-01 Epub Date: 2025-11-13 DOI: 10.1200/CCI-25-00186

McKenna Perrin, Crystal Hattum, Jamie Arens, Tobias Meissner

Purpose: Acute care use (ACU) is more costly and prolonged for oncology patients and often leads to treatment disruptions and worsened outcomes. Reducing ACU requires understanding risk factors and proactively identifying at-risk patients. This study addresses research gaps by developing predictive models to assess all-cause acute care use (A-ACU) versus preventable acute care use (P-ACU) and rural-specific barriers.

Patients and methods: We conducted a retrospective cohort study of adult oncology patients who received intravenous cancer treatment between October 2021 and April 2024 within a rural midwestern regional cancer network. We used predictor and outcome data from electronic medical records and insurance claims. We defined P-ACU using the Centers for Medicare & Medicaid Services' OP-35 criteria and classified A-ACU as any emergency department visit or hospitalization, regardless of reason. We trained LASSO and Random Forest models on 80% of the cohort to predict 30-, 90-, and 180-day risk of P-ACU and A-ACU after regimen initiation.
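
The abstract states that models were trained on 80% of the cohort but does not say how the split was made. The sketch below shows one common, reproducible way to do a patient-level hold-out split; the function name, the fixed seed, and the use of simple random shuffling are assumptions, not details from the study.

```python
import random


def split_cohort(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle patient identifiers reproducibly and split them into two lists.

    Splitting by patient (rather than by visit or regimen) keeps all records
    for one person on the same side of the split. The seed is arbitrary and
    only makes the shuffle repeatable.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]
```

For a cohort of 2,922 patients (the size reported in the Results), `split_cohort(range(2922))` yields 2,337 training and 585 held-out identifiers with no overlap.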

Results: Among 2,922 patients, 45.3% experienced A-ACU and 10.3% had P-ACU within 180 days of chemotherapy regimen initiation. Key predictors included number of previous inpatient stays and comorbidities. Insurance type and age were more influential in predicting P-ACU, whereas laboratory values (albumin, sodium, and neutrophil-to-lymphocyte ratio) were more important in A-ACU models. Nearly all LASSO and Random Forest models showed strong performance (mean area under the receiver operating characteristic curve = 0.73, mean F1 score = 0.79).

Conclusion: Our models effectively identify patients at high risk for ACU using routinely collected data and validate known risk factors in a large rural oncology population. Future work should integrate these tools into practice and address rural-specific challenges to reduce ACU during chemotherapy.

Volume 9, e2500186. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12637137/pdf/

Citations: 0

Unsupervised Large Language Models to Identify Topics in Cancer Center Patient Portal Messages.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-10-01 DOI: 10.1200/CCI-25-00102

Ji Hyun Chang, Amir Ashraf-Ganjouei, Isabel Friesner, Ryzen Benson, Travis Zack, Sumi Sinha, Jason Chan, Steve Braunstein, Amy Lin, Lisa Singer, Julian C Hong

Purpose: The increasing use of patient portal messages has enhanced patient-provider communication. However, the high volume of these messages has also contributed to physician burnout.

Methods: Patient-generated portal messages sent to a single cancer center from 2011 to 2023 were extracted. BERTopic, a natural language processing topic modeling technique based on large language models, was optimized. For further categorization, the topic words were labeled using GPT-4, followed by review by two oncologists. Uniform Manifold Approximation and Projection was used for dimensionality reduction and visualizing topics. Message volume changes over time were assessed using a Student's t test.

Results: A total of 2,280,851 messages were analyzed. The monthly average number of messages increased from 2,071 in 2012 to 43,430 in 2022 (P < .001). There was a significant rise in message volume after the COVID-19 pandemic, with a posterior probability of a causal effect of 96.4% (P = .04). Scheduling-related messages were the most frequent across departments, whereas symptoms and health concerns were second or third most common topics. In medical oncology and surgical oncology, topics on prescriptions and medications were more common compared with radiation oncology and gynecologic oncology. Despite concurrent institutional changes in self-scheduling systems, scheduling-related messages did not decrease over time.

Conclusion: The substantial increase in patient portal messages, particularly scheduling-related inquiries, underscores the need for streamlined communication to reduce the burden on health care providers. These findings highlight the need for strategies to manage message volume and mitigate physician burnout, laying groundwork for artificial intelligence-driven future triage systems to improve message management and patient care.

Volume 9, e2500102. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490804/pdf/

Citations: 0

Machine Learning Designed for Any Hematologic Flow Cytometry Data Set.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-10-01 Epub Date: 2025-10-29 DOI: 10.1200/CCI-24-00259

Johannes Mammen, Calin-Petru Manta, Sarah Richter, Nora Liebers, Tobias Roider, Felix Czernilofsky, Katharina Kriegsmann, Carsten Müller-Tidow, Michael Hundemer, Sascha Dietrich

Purpose: Flow cytometry is a key diagnostic technique in hematology that provides protein information at a single-cell level. The data are traditionally interpreted manually in a sequence of two-dimensional plots, but automated analysis techniques have grown in significance in both research and clinical practice, improving interrater reliability and speeding up analysis. Published tools usually require a specific diagnostic setup, which hinders widespread implementation.

Methods: In this paper, we present the development of a software package and web app (diagnFlow) for the automated analysis of any in-house clinical flow cytometry data set. We exemplify the application of this classifier and its clinical benefit in lymphoma diagnosis and other settings.

Results: Routine performance for the focused diagnostic task was evaluated in a blinded one-examiner setup. Multiple customary workflows solving the task in an automated manner were designed using diagnFlow. Each workflow could improve on the performance of the manual interpretation. The most easily interpretable and computationally efficient workflow outperformed more complicated approaches and was made available as an easy-to-use web app. Same-sample wet laboratory data further elucidated the biological signal on which the classifier is based. The approach made available as a web app was validated in additional data sets, where it outperformed a competition-winning clustering-based approach.

Conclusion: diagnFlow provides a valuable data set-agnostic approach to flow cytometry data sets previously not leveraged for automatic analysis while maintaining interpretability and resource efficiency.

Volume 9, e2400259. Publication type: Journal Article.

Citations: 0

Using Natural Language Processing to Assess Goals-of-Care Conversations for Patients With Cancer.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-10-01 Epub Date: 2025-10-16 DOI: 10.1200/CCI-24-00239

Melissa K Greene, Gloria Broadwater, Donna Niedzwiecki, Thomas W LeBlanc, Jessica E Ma, David J Casarett, Brittany A Davidson

Purpose: Goals-of-care (GOC) discussions during advanced serious illness and end-of-life (EOL) care are critical. Institutions are increasingly tracking the frequency and timing of GOC documentation, but large-scale content assessments have been limited. We aimed to use natural language processing (NLP) to assess GOC documentation quality and associations with EOL care for patients with cancer.

Methods: This is a retrospective review of patients at a single US center who died with cancer between 2018 and 2022, and had documented GOC notes in the last 12 months of life. Eight GOC components were identified: current understanding of illness, information preferences, prognostic disclosure, goals, fears, acceptable function, trade-offs, and family involvement. NLP software searched for the aggregate presence of these components at the patient level within extracted GOC notes. We evaluated associations between these eight components and receipt of aggressive EOL care (chemotherapy within 14 days of death, no hospice care, or hospice admission ≤3 days of death).
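
The three-part definition of aggressive EOL care given above translates directly into a rule. The sketch below encodes it for a single patient; the function and parameter names are hypothetical, and treating both day thresholds as inclusive is an assumption, since the abstract does not spell out the boundary handling.

```python
from datetime import date
from typing import Optional


def received_aggressive_eol_care(
    death: date,
    last_chemo: Optional[date],
    hospice_admission: Optional[date],
) -> bool:
    """Return True if any of the three criteria from the abstract is met.

    Criteria: chemotherapy within 14 days of death, no hospice care,
    or hospice admission within 3 days of death.
    """
    chemo_near_death = (
        last_chemo is not None and 0 <= (death - last_chemo).days <= 14
    )
    no_hospice = hospice_admission is None
    late_hospice = (
        hospice_admission is not None
        and 0 <= (death - hospice_admission).days <= 3
    )
    return chemo_near_death or no_hospice or late_hospice
```

Under this rule, a patient with no chemotherapy near death who entered hospice four weeks before dying is not flagged, whereas a patient who never entered hospice is flagged regardless of chemotherapy timing.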

Results: Two thousand thirty-one patients met inclusion criteria. The most common GOC component addressed was family involvement (75.0%) and the least common was fears (21.1%). Only 5.4% had all eight components documented. More comprehensive GOC notes were associated with lower rates of aggressive EOL care; 73.2% received aggressive care when 0/8 components were documented, compared with 56.8% and 50.3% with six or seven components discussed, respectively. In multivariate logistic regression, GOC components documented (≤6 v ≥7: OR, 2.13; P < .0001) and primary tumor site (lymphoma: OR, 2.86; P < .0001) were independent predictors of aggressive EOL care.

Conclusion: Increasingly comprehensive and higher-quality GOC documentation is associated with a lower likelihood of receiving aggressive EOL care. Opportunities to improve the quality and documentation of GOC conversations may affect EOL care for patients with cancer.

Volume 9, e2400239. Publication type: Journal Article.

Citations: 0

OncovigIA: Artificial Intelligence for Early Lung Cancer Detection and Referral in a Chilean Public Hospital.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-10-01 Epub Date: 2025-10-02 DOI: 10.1200/CCI-25-00035

Jose Peña, Sebastián Santana, Juan Cristobal Morales, Natalie Pinto, Mariano Suárez, Carola Sánchez, Juan Carlos Opazo, Rodrigo Villarroel, Claudio Montenegro, Bruno Nervi, Richard Weber

Purpose: Lung cancer is a leading cause of death in Chile, where late-stage diagnoses and high mortality rates prevail. Here, we describe the development of OncovigIA, a novel digital tool powered by natural language processing that enhances the identification of potential lung cancer cases by surveilling computed tomography (CT) reports in a large public hospital in Santiago, Chile.

Materials and methods: We combined natural language processing and large language models with state-of-the-art machine learning techniques and approaches to treat unbalanced data sets and determine the best solution to implement in OncovigIA. Focusing on key sections of the reports and using various machine learning models, including a balanced Random Forest, the tool achieved high performance with 0.90 accuracy and 0.84 F1-score on the test set.
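
Because the data sets are described as unbalanced, accuracy and F1-score can diverge, which is presumably why both are reported. The sketch below computes the two metrics from confusion-matrix counts. The function name is hypothetical, and the counts in the usage note are invented for illustration; they are not the study's test-set counts, which the abstract does not give.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1-score from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives. F1 is the harmonic mean of precision and recall
    and ignores true negatives, so it is less flattered by a large
    negative class than accuracy is.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1
```

With made-up counts of 42 true positives, 8 false positives, 8 false negatives, and 102 true negatives, accuracy is 0.90 while F1 is 0.84. A classifier that labels every report negative on the same 160 reports would still reach about 0.69 accuracy but an F1 of 0, which is the gap the second metric exposes.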

Results: When applied to 13,326 CT chest reports from 2022, it successfully identified 377 CTs of patients with suspected lung cancer previously undetected and not managed by the multidisciplinary local lung cancer team.

Conclusion: This study underscores the potential of artificial intelligence in early cancer detection and highlights the importance of its integration into local health care ecosystems. By promptly increasing the number of patients referred for specialized management, the tool OncovigIA offers a promising path toward improving lung cancer survival rates in Chile and beyond. Moreover, this article provides avenues for its broader implementation, extending it to other cancer types and/or health care-related texts for continuous surveillance, aiming at the early referral and treatment of cancer in low-resource settings.

Volume 9, e2500035. Publication type: Journal Article.

Citations: 0

Development and Validation of a Simulation Model-Based Tool to Support Individualized Physical Activity Discussions and Prescriptions for Breast Cancer Survivors.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-10-01 Epub Date: 2025-10-17 DOI: 10.1200/CCI-25-00151

Jinani C Jayasekera, Oliver W A Wilson, Clyde Schechter, Jennifer L Caswell Jin, Kaitlyn M Wojcik, Nicolien T van Ravesteyn, Jonathan Wall, Jacob Schneider, Lia L D'Addario, Janise M Roh, Swarnavo Sarkar, Lisa Cadmus-Bertram, John P Pierce, Amy Trentham-Dietz, Lawrence H Kushi, Charles E Matthews

Purpose: Clinical guidelines recommend offering individualized physical activity prescriptions to cancer survivors. However, there are limited tools to support individualized physical activity discussions and prescriptions. We developed and validated a simulation model-based tool to estimate individualized survival outcomes for postdiagnosis physical activity among postmenopausal breast cancer survivors.

Methods: We adapted an established simulation modeling approach developed within the Cancer Intervention and Surveillance Modeling Network to estimate breast cancer-specific and all-cause survival associated with postdiagnosis physical activity for 50- to 75-year-old (postmenopausal) women with stage I to III invasive breast cancer. Model estimates were generated for 60,480 subgroups based on age, weight status (BMI), stage, tumor subtype, treatment, aerobic (<30 min/wk [no/minimal], ≥30 to <150 min/wk [insufficient], ≥150 to <300 min/wk [active], ≥300 min/wk [highly active]), and muscle-strengthening (<2 or ≥2 d/wk) activity. The outcomes were 10-year survival and absolute survival benefits for different levels of physical activity by individual characteristics and treatment. Model inputs were derived from trials, cohort studies, registry, and surveillance data. External validation used independent data.
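
The activity bands quoted in the Methods can be written as a small lookup. The following is a minimal Python sketch, assuming weekly aerobic minutes and muscle-strengthening days have already been totalled; the function names `aerobic_category` and `meets_strength_threshold` are hypothetical and are not taken from the authors' tool.

```python
def aerobic_category(minutes_per_week: float) -> str:
    """Map weekly aerobic minutes to the four bands quoted in the abstract."""
    if minutes_per_week < 0:
        raise ValueError("minutes_per_week must be non-negative")
    if minutes_per_week < 30:
        return "no/minimal"
    if minutes_per_week < 150:
        return "insufficient"
    if minutes_per_week < 300:
        return "active"
    return "highly active"


def meets_strength_threshold(days_per_week: int) -> bool:
    """True when muscle-strengthening activity is at least 2 d/wk."""
    if days_per_week < 0:
        raise ValueError("days_per_week must be non-negative")
    return days_per_week >= 2
```

The lower bound of each band is inclusive and the upper bound exclusive, matching the "≥30 to <150" style of the thresholds.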

Results: Survival rates and absolute benefits for physical activity varied by age, weight status, stage, tumor subtype, and amount and type of activity. For example, the 10-year breast cancer-specific and all-cause survival for no/minimal activity in a 65- to 69-year-old woman with stage II, hormone receptor-positive, human epidermal growth factor receptor 2-negative breast cancer with obesity was 79.2% and 72.2%, respectively. Increasing aerobic activity from no/minimal to insufficient activity with <2 d/wk of muscle-strengthening was associated with absolute increases in 10-year breast cancer-specific and all-cause survival by 2.8 and 3.4 percentage points, respectively. The model closely replicated survival rates in independent data.

Conclusion: Simulation model-based estimates could support clinical tools for guideline-recommended individualized discussions and physical activity prescriptions for breast cancer survivors.

Citations: 0

Large Language Models for Translational Cancer Informatics.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-10-01 Epub Date: 2025-10-14 DOI: 10.1200/CCI-25-00108

Yining Pan, Yanfei Wang, Guangyu Wang, Jing Su, Umit Topaloglu, Qianqian Song

Purpose: Cancer remains a leading cause of death worldwide. The growing volume of high-throughput single-cell and spatial transcriptomic data sets, particularly those related to cancer, offers immense opportunities as well as analytical challenges for effective data analysis and interpretation. Large language models (LLMs), pretrained on vast data sets and capable of various biomedical tasks, offer a promising solution. This review explores the application of LLMs in cancer research from both cellular and pathologic perspectives, aiming to showcase their potential in advancing precision oncology.

Materials and methods: We systematically review current LLMs in analyzing single-cell RNA sequencing, spatial transcriptomic, and histology image data, emphasizing their relevance to cancer biology and translational research.

Results: A total of 24 LLMs, published or in preprint between 2022 and 2025, were selected for review. In single-cell transcriptomics, LLMs have primarily been used for cell type annotation, batch integration, and drug-response prediction. In spatial transcriptomics, LLMs support multislide and multimodal spatial data integration, gene expression imputation, niche and region label prediction, spatial domain identification, cell-cell communication inference, and marker gene detection. In computational pathology, LLMs have been applied to cancer subtyping, detection of rare malignancies, genomic mutation prediction, image segmentation, as well as cross-modal retrieval. Despite these advances, many models remain underoptimized for cancer-specific applications, highlighting the need for domain-specific fine-tuning and scalable adaptation strategies.

Conclusion: LLMs have the potential to significantly advance cancer research by providing scalable and effective tools for analyzing and interpreting single-cell, spatial transcriptomic, and pathology data. Future efforts should prioritize tailoring these models to cancer-specific contexts to enhance their utility in uncovering disease mechanisms, identifying biomarkers, and informing therapeutic strategies.

Citations: 0

Leveraging Centralized Health System Data Management and Large Language Model-Based Data Preprocessing to Identify Predictors for Radiation Therapy Interruption.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-10-01 Epub Date: 2025-10-28 DOI: 10.1200/CCI-25-00218

Fekede Asefa Kumsa, Christopher L Brett, Soheil Hashtarkhani, Rezaur Rashid, Lokesh Chinthala, Janet A Zink, Robert L Davis, Arash Shaban-Nejad, David L Schwartz

Purpose: Unplanned treatment interruptions represent an important care quality shortfall for patients undergoing cancer radiotherapy. This study aimed to evaluate the use of a centralized electronic health record warehouse and large language model-based data preprocessing to facilitate identification of risk factors for radiation therapy interruptions (RTI).

Methods: We analyzed demographic, behavioral, clinical, and neighborhood-level data for 2,130 patients treated with radiotherapy at the University of Tennessee Medical Center in Knoxville. Treatment interruptions were measured as missed days, adjusted for weekends and holidays. Multinomial logistic regression was used to identify factors associated with moderate (2-4 days) and severe (≥5 days) RTI.
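
One way to read the outcome definition in the Methods is sketched below in Python. It assumes that "adjusted for weekends and holidays" means only weekdays that are not holidays count as scheduled treatment days; the abstract does not spell out the exact rule. The function names and the `"none_or_minimal"` label for fewer than 2 missed days are hypothetical.

```python
from datetime import date, timedelta


def missed_treatment_days(start, end, treated, holidays=frozenset()):
    """Count weekdays in [start, end] that are not holidays and had no treatment.

    Assumption: weekends and listed holidays are never scheduled treatment days.
    """
    missed = 0
    day = start
    while day <= end:
        is_scheduled = day.weekday() < 5 and day not in holidays
        if is_scheduled and day not in treated:
            missed += 1
        day += timedelta(days=1)
    return missed


def rti_category(missed_days: int) -> str:
    """Apply the abstract's cut points: moderate is 2-4 days, severe is >=5 days."""
    if missed_days < 0:
        raise ValueError("missed_days must be non-negative")
    if missed_days >= 5:
        return "severe"
    if missed_days >= 2:
        return "moderate"
    return "none_or_minimal"  # assumed label; the abstract does not name this group


# Illustrative two-week course (made-up dates, not study data).
course_start, course_end = date(2025, 1, 6), date(2025, 1, 17)
delivered = {date(2025, 1, d) for d in (6, 7, 10, 13, 14, 16, 17)}
missed = missed_treatment_days(course_start, course_end, delivered)
category = rti_category(missed)
```

The three-level result is the kind of outcome a multinomial logistic regression takes as its dependent variable.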

Results: Moderate RTI occurred in 15.8% of patients, while 7.7% experienced severe RTI. Moderate RTI was associated with genitourinary cancer (adjusted odds ratio [AOR], 3.81; 95% CI, 1.24 to 11.66), prostate cancer (AOR, 2.44; 95% CI, 1.34 to 4.46), and Medicaid coverage (AOR, 2.22; 95% CI, 1.32 to 3.73). Severe RTI was associated with marital status (AOR for divorced or separated patients, 1.86; 95% CI, 1.18 to 2.94), head and neck cancer (AOR, 2.31; 95% CI, 1.10 to 4.87), gynecologic cancer (AOR, 2.97; 95% CI, 1.30 to 6.79), Medicaid insurance (AOR, 3.43; 95% CI, 1.77 to 6.64), daily dose of ≤225 cGy (AOR, 2.55; 95% CI, 1.21 to 5.37), and a total dose of ≥6,000 cGy (AOR, 2.30; 95% CI, 1.09 to 4.88). Severe interruptions were also significantly associated with high neighborhood social vulnerability (AOR, 2.60; 95% CI, 1.32 to 5.09).
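
A reported odds ratio and its confidence interval can be sanity-checked with a few lines of arithmetic. The Python sketch below assumes a Wald-type 95% interval that is symmetric on the log-odds scale, which is the usual output of logistic regression but is not stated in the abstract. The helper names are hypothetical.

```python
import math


def log_scale_se(ci_low: float, ci_high: float, z_crit: float = 1.96) -> float:
    """Standard error of log(OR) implied by a CI that is symmetric on the log scale."""
    if not 0 < ci_low < ci_high:
        raise ValueError("expected 0 < ci_low < ci_high")
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)


def wald_z(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Wald z statistic for the null hypothesis OR = 1."""
    return math.log(odds_ratio) / log_scale_se(ci_low, ci_high)


def implied_point_estimate(ci_low: float, ci_high: float) -> float:
    """Geometric mean of the limits; should sit close to the reported OR."""
    return math.sqrt(ci_low * ci_high)


# Medicaid insurance and severe RTI, as reported: AOR 3.43 (95% CI, 1.77 to 6.64).
se = log_scale_se(1.77, 6.64)
z = wald_z(3.43, 1.77, 6.64)
midpoint = implied_point_estimate(1.77, 6.64)
```

For this row the geometric mean of the limits is about 3.43, which agrees with the reported AOR. A large gap would suggest a transcription error or an interval that is not of the Wald type.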

Conclusion: Automated data preprocessing permitted efficient identification of treatment course length, marital status, disease site, Medicaid coverage, and socially vulnerable locations as significant factors associated with RTI. These findings underscore the need for data-driven risk assessment and intervention strategies to maintain cancer treatment quality at scale.

Citations: 0

Large Language Models in Population Oncology: A Contemporary Review on the Use of Large Language Models to Support Data Collection, Aggregation, and Analysis in Cancer Care and Research.

IF 2.8 Q2 ONCOLOGY

JCO Clinical Cancer Informatics

Pub Date : 2025-10-01 Epub Date: 2025-10-24 DOI: 10.1200/CCI-25-00112

Ryzen Benson, Clodagh Kenny, Amir Ashraf Ganjouei, Michelle Zhao, Rami Darawsheh, Alexander Qian, Julian C Hong

Over the past 5 years, large language models (LLMs) have emerged and continued to improve in their generative abilities and are now capable of generating human-understandable text and performing complex data analyses. As these models continue to improve in their capabilities, they are increasingly used to support population oncology, including clinical information extraction, cancer care education, and clinical decision support. This narrative review provides a high-level description of the use of LLMs in cancer with an overview of the current literature, along with research gaps. Despite increasing interest in using LLMs for cancer care, prevention, and research, applied methods in cancer still lag advancements published in the computer science literature. Therefore, we recommend that cancer-focused LLM research and applications better incorporate technical advancements and techniques found in the computer science literature. Additionally, standardized evaluation metrics and approaches need to be better studied and adopted in oncology, along with data governance and computational infrastructure to support state-of-the-art model integration and the use of real-world data. Finally, we describe the need for researchers to incorporate principles and frameworks from implementation and dissemination science to promote LLM-based tool adaptation, effectiveness, fit, and sustainability.

Citations: 0