Pub Date: 2025-11-01 | Epub Date: 2025-11-10 | DOI: 10.1200/CCI-25-00211
Rob G Stirling, David R Baldwin, David Heineman, Michel W J M Wouters, Neal Navani, Paul Dawkins, Angela Melder, John Zalcberg, Erik Jakobsen
Purpose: Lung cancer is the leading global cause of cancer mortality with substantial evidence of inequity, disparity in process and outcomes, and unwarranted clinical variation. Over the last decades, there has been major evolution and discovery in best evidence-based practice (EBP), enhancing diagnostics, management, and the delivery of precision medicine. However, questions remain about the completeness of translation of best EBP into delivered care.
Design: Learning health systems (LHSs) have been defined as improvement environments where knowledge generation processes are embedded into daily clinical practice to continually improve the quality, safety, and outcomes of health care delivery. Lung cancer clinical quality registries (CQRs) provide a rigorous infrastructure supporting LHS function through the collection, analysis, and reporting of care process and outcome information delivered by health service organizations. CQRs measure the appropriateness and effectiveness of delivered care and report on the degree of best EBP delivery by stakeholder providers. The provision of risk-adjusted, benchmark reporting to stakeholders describes equity, disparity, and unwarranted clinical variation and is a fundamental driver of improvement in the safety and quality of care provided to consumers.
Results: There is mounting international evidence of the positive impacts of CQR reporting on management processes, health care infrastructure, survival, quality improvement, and education within lung cancer communities. The use of implementation science approaches including the Knowledge to Action framework targets bridging the gaps between evidence-based knowledge and practice.
Conclusion: Registry evolution is exemplified by the Danish Lung Cancer Registry, National Lung Cancer Audit (United Kingdom), Dutch Lung Cancer Audit, and Victorian Lung Cancer Registry (Australia), which identify innovation opportunities to close the evidence-practice gap, overcome service deficits, and lead to better decision making for health care improvement.
Title: Utilization of Lung Cancer Registries in Learning Health Systems for Health Care Improvement.
Journal: JCO Clinical Cancer Informatics, volume 9, article e2500211. Publication type: Journal Article. Periodical IF: 2.8.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12622281/pdf/
Pub Date: 2025-11-01 | Epub Date: 2025-11-06 | DOI: 10.1200/CCI-25-00033
Elena Zazzetti, Saverio D'Amico, Flavia Jacobs, Rita De Sanctis, Lorenzo Chiudinelli, Mariangela Gaudio, Gianluca Asti, Mattia Delleani, Elisabetta Sauta, Mirco Quintavalla, Alessandro Bruseghini, Luca Lanino, Giulia Maggioni, Alessia Campagna, Victor Savevski, Matteo G Della Porta, Alberto Zambelli
Purpose: Real-world data (RWD) are critical for breast cancer (BC) research but are limited by privacy concerns, missing information, and data fragmentation. This study explores synthetic data (SD) generated through advanced generative models to address these challenges and create harmonized longitudinal data sets.
Methods: A data set of 1,052 patients with human epidermal growth factor receptor 2-positive and triple-negative BC from the Informatics for Integrating Biology and the Bedside (i2b2) platform was used. Advanced generative models, including generative adversarial networks (GANs), variational autoencoders (VAEs), and language models (LMs), were applied to generate synthetic longitudinal data sets replicating disease progression, treatment patterns, and clinical outcomes. The Synthetic Validation Framework (SAFE) powered by Train was used to evaluate fidelity, utility, and privacy. SD were tested across three settings: (A) integration with i2b2 for privacy-preserving data sets; (B) multistate disease modeling to predict clinical outcomes; and (C) generation of synthetic control groups for clinical trials.
Results: The synthetic data sets exhibited high fidelity (score 0.94) and ensured privacy, with temporal patterns validated through time-series analyses and Uniform Manifold Approximation and Projection embeddings. In setting A, SD accurately mirrored RWD on the i2b2 platform while maintaining privacy. In setting B, incorporating SD improved the predictive performance of a multistate disease progression model, increasing the C-index by up to 10%. In setting C, SD replicated the end points of the APT trial, demonstrating its feasibility for generating synthetic control arms with preserved statistical properties of the real data set.
Conclusion: AI-generated longitudinal SD effectively address key challenges in RWD use in BC. This approach can improve translational research and clinical trial design while ensuring robust privacy protection. Integration with platforms such as i2b2 highlights their scalability and potential for broader applications in oncology.
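The Results above report predictive performance as a C-index. As a reminder of what that number measures, here is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' evaluation code: the function name, the simplified handling of tied times, and the four-patient example are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs ranked correctly by risk.

    times       -- observed follow-up time per patient
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output, higher meaning earlier expected event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): pairs with equal times are skipped,
        # and a pair is comparable only if the shorter time is an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not study data).
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported improvement should be read.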
Title: Longitudinal Synthetic Data Generation by Artificial Intelligence to Accelerate Clinical and Translational Research in Breast Cancer.
Journal: JCO Clinical Cancer Informatics, volume 9, article e2500033. Publication type: Journal Article. Periodical IF: 2.8.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12614387/pdf/
Pub Date: 2025-11-01 | Epub Date: 2025-11-07 | DOI: 10.1200/CCI-25-00122
Meredith C B Adams, Cody L Hudson, Matthew L Perkins, Robert W Hurley, Umit Topaloglu
Purpose: We developed and validated a dual-purpose, open-access Rural-Urban Commuting Area (RUCA) tool to standardize geographic coding for cancer disparities research, addressing National Institutes of Health (NIH) Helping to End Addiction Long-term (HEAL) Initiative Common Data Element requirements while supporting institutional catchment area analyses.
Methods: This web-based tool (reference 16) integrates US Department of Agriculture RUCA codes with census tract data and electronic health record systems, meeting NIH HEAL Initiative Findable, Accessible, Interoperable, and Reusable (FAIR) data ecosystem requirements. We implemented the tool using Wake Forest Cancer Center's 2023 registry data (n = 21,219) and conducted systematic comparison with county-level Rural-Urban Continuum Code (RUCC) classifications using 18,714 cancer cases across 336 ZIP codes, focusing on breast, colon, and lung cancers to demonstrate enhanced geographic granularity.
Results: Among 21,219 patients with cancer, 19.51% (n = 4,140) resided in rural areas, with 4.81% (n = 1,022) in the most rural census tracts (RUCA codes 7-10). Comparative analysis revealed 9.4% disagreement between RUCA and RUCC classifications, affecting 1,765 patients. Twenty-eight ZIP codes classified as rural by RUCA were located within metropolitan counties according to RUCC, encompassing 109 patients with cancer who would be misclassified using county-level measures. As a separate use case, integration with NIH HEAL Initiative standardized rurality data collection across 15 research studies.
Conclusion: The RUCA tool addresses critical gaps in geographic data standardization by providing census tract-level precision that county-level classifications miss. This dual-application framework aligns institutional catchment analyses with national standardization efforts, identifying 109 patients with cancer who would be misclassified as urban residents using traditional county-level approaches, thereby enhancing targeted interventions for rural cancer care access.
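The RUCA-versus-RUCC comparison described above can be illustrated with a short sketch. This is not the published tool: the function names and the sample records are hypothetical, and the rural cut-offs used here (RUCA primary codes 4-10 and RUCC codes 4-9 treated as non-metropolitan) are a common convention that the abstract does not confirm as the authors' definition.

```python
def ruca_is_rural(ruca_code: int) -> bool:
    """Assumed cut-off: RUCA primary codes 1-3 metropolitan, 4-10 rural."""
    if not 1 <= ruca_code <= 10:
        raise ValueError(f"RUCA primary code out of range: {ruca_code}")
    return ruca_code >= 4


def rucc_is_rural(rucc_code: int) -> bool:
    """Assumed cut-off: RUCC codes 1-3 metropolitan, 4-9 nonmetropolitan."""
    if not 1 <= rucc_code <= 9:
        raise ValueError(f"RUCC code out of range: {rucc_code}")
    return rucc_code >= 4


def compare_classifications(records):
    """records: iterable of (ruca_code, rucc_code), one pair per patient.

    Returns (number disagreeing, share disagreeing,
             number rural by RUCA but metropolitan by RUCC).
    """
    records = list(records)
    disagree = 0
    rural_in_metro_county = 0
    for ruca, rucc in records:
        by_tract = ruca_is_rural(ruca)
        by_county = rucc_is_rural(rucc)
        if by_tract != by_county:
            disagree += 1
            if by_tract:
                rural_in_metro_county += 1
    return disagree, disagree / len(records), rural_in_metro_county


# Hypothetical patients (not registry data).
sample = [(1, 1), (2, 2), (7, 2), (10, 8), (4, 3), (1, 6)]
n_disagree, share, n_hidden_rural = compare_classifications(sample)

# Arithmetic check on the figures reported in the abstract.
rural_pct = round(100 * 4140 / 21219, 2)
disagreement_pct = round(100 * 1765 / 18714, 1)
```

The third return value corresponds to the group the abstract highlights: patients whose census tract is rural but whose county is metropolitan, and who would therefore be counted as urban under a county-level measure.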
Title: Leveraging the Rural-Urban Commuting Area Tool to Address Geographic Disparities in Cancer Care: A Dual-Application Framework for Institutional and National Initiatives.
Journal: JCO Clinical Cancer Informatics, volume 9, article e2500122. Publication type: Journal Article. Periodical IF: 2.8.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12614380/pdf/
Pub Date: 2025-11-01 | Epub Date: 2025-11-13 | DOI: 10.1200/CCI-25-00228
Desiree R Azizoddin, Sara M DeForge, Jian Zhao, Meng Chen, Kyla Smith, Kristin L Schreiber, Robert R Edwards, Matthew Allsop, Ashton Baltazar, Ryan Nipp, Misty Walker, James A Tulsky, Michael Businelle, Andrea C Enzinger
Purpose: Patients with advanced cancer often experience pain symptoms. Pain-cognitive behavioral therapy (pain-CBT) represents an effective psychological treatment for chronic pain, yet access remains limited. We conducted a pilot study to assess the feasibility and acceptability of a mobile health (mHealth) intervention that integrates pain-CBT with opioid education and tracking to improve chronic pain management in patients with advanced cancer.
Methods: Adults with advanced cancer and pain (≥4/10, Numeric Rating Scale) using opioids tested the smartphone-based intervention for 28 days; completed baseline, end-of-study, and 2-week postintervention surveys; and participated in optional qualitative interviews. The intervention assessed pain, mood, catastrophizing, sleep, and opioid use, and provided tailored just-in-time adaptive interventions and daily psychoeducation (articles, serious game). We assessed feasibility (≥50% app use), acceptability (Acceptability E-scale), and pre-post intervention changes in pain, and conducted thematic analysis of perceived impact and usefulness.
Results: Among 64 eligible patients, 32 (mean age, 55.41 years; 55% female; 32% rural-dwelling) enrolled. Of those, 59% (n = 19) used the app ≥50% of days on study, and rated the intervention with good acceptability (mean, 24.85; standard deviation, 3.72). Nonsignificant reductions in pain intensity, pain interference, and pain catastrophizing were observed from baseline to 4- and 6-week follow-ups. In debriefing interviews, patients described that the intervention contributed to pain self-management knowledge, promoted pain coping skills, and reduced opioid stigma.
Conclusion: Study results support feasibility and acceptability of a pain-CBT intervention for patients with advanced cancer pain. Although exploratory analyses showed nonsignificant improvements in pain outcomes, qualitative findings indicate meaningful engagement and skill development. Future testing is needed to determine intervention efficacy.
Title: Pilot Testing of a Multicomponent Cancer Pain-Cognitive Behavioral Therapy mHealth App for Patients With Advanced Cancer.
Journal: JCO Clinical Cancer Informatics, volume 9, article e2500228. Publication type: Journal Article. Periodical IF: 2.8.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12616478/pdf/
Pub Date: 2025-11-01 | Epub Date: 2025-11-10 | DOI: 10.1200/CCI-25-00044
Ramez Kouzy, Elaine E Cha, Allison Rosen, Danielle S Bitterman
This narrative review examines the current landscape and evidence regarding large language model (LLM) applications designed to support patients with cancer and caregivers. We analyzed peer-reviewed literature, conference proceedings, and implementation studies exploring LLM use in oncology patient support. Applications cluster in four primary domains: education and information delivery, symptom checking and triage, telehealth integration, and clinical trial participation. Studies demonstrate promising accuracy for basic cancer information delivery, although performance varies for complex clinical scenarios. Early research shows preclinical feasibility and acceptability of LLM-enhanced tools for patients, but effectiveness data remain limited. Implementation barriers include scalable monitoring, equitable access, maintaining privacy standards, and validating accuracy across diverse populations. We also examine potential future applications across the cancer care continuum, from prevention through end-of-life care, and propose strategies for development and implementation. Additionally, we present a framework to guide physician-patient discussions regarding LLM use in oncology, addressing privacy concerns, setting appropriate expectations, and ensuring safe integration into care delivery. Future research should use robust evaluation frameworks focused on safety and patient-centered outcomes while carefully considering health equity implications. As these technologies evolve, maintaining focus on evidence-based validation will be crucial for realizing their potential to enhance cancer care delivery, engagement, and patient satisfaction.
Title: Review of Large Language Models for Patient and Caregiver Support in Cancer Care Delivery.
Journal: JCO Clinical Cancer Informatics, volume 9, article e2500044. Publication type: Journal Article. Periodical IF: 2.8.
Pub Date: 2025-11-01 | Epub Date: 2025-11-03 | DOI: 10.1200/CCI-25-00143
Anlan Cao, Kristina L Johnson, Ijeamaka Anyene Fumagalli, Emma S Armstrong, Wendy Y Chen, Edward Giovannucci, Kenneth L Kehl, Jeffrey A Meyerhardt, Charles Quesenberry, Michael H Rosenthal, Elizabeth M Cespedes Feliciano
Purpose: Cancer recurrence in clinical settings is documented in unstructured text, requiring labor-intensive manual record review to extract this outcome. A shareable natural language processing model developed at Dana-Farber Cancer Institute (DFCI), DFCI-imaging-student, efficiently extracts cancer outcomes from radiology reports. We applied this model in a community oncology setting, aggregating report-level predictions to derive patient-level outcomes, and evaluated its performance in determining recurrence and time-to-recurrence in patients with breast cancer (BC) or colorectal cancer (CRC).
Methods: We randomly sampled 200 patients with BC and 200 patients with CRC from two cohorts at Kaiser Permanente Northern California. Patients were diagnosed with stage III disease (2005-2019) and followed until July 31, 2024, death, or disenrollment. We manually reviewed recurrence (local/regional/distant), recurrence date, and sites of recurrence using oncology, radiology, and pathology information in electronic health records. We then applied the DFCI-imaging-student model to radiology reports and compared recurrence based on the model outcomes against manual review.
Results: A total of 7,195 radiology reports were processed. During a median follow-up of 8.4 years for BC and 6.8 years for CRC, manual review identified 78 recurrence cases in BC (39%) and 70 in CRC (35%). The DFCI-imaging-student model demonstrated high sensitivity and specificity for recurrence detection in both cancers (breast: 92.3% and 92.6%, CRC: 94.3% and 86.9%) and moderate-to-high accuracy in identifying the sites of distant metastasis. Among true positives, the median error in time-to-recurrence was 0.16 months for breast and 0.48 months for CRC.
Conclusion: Outcomes derived from the DFCI-imaging-student model output demonstrated high accuracy, providing an efficient determination of recurrence and time-to-recurrence in large-scale research to improve recurrence surveillance and facilitate collaborative research.
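The two steps described above, rolling report-level predictions up to a patient-level outcome and then scoring that outcome against manual review, can be sketched as follows. This is a hedged illustration rather than the study's pipeline: the "any report at or above a threshold" rule, the 0.5 threshold, the function names, and the example patient are assumptions. The confusion-matrix counts for BC (72 true positives, 113 true negatives) are back-calculated here from the reported percentages, assuming all 200 sampled patients were evaluated; they are not stated in the abstract.

```python
from datetime import date


def patient_level_outcome(reports, threshold=0.5):
    """Aggregate one patient's radiology-report predictions.

    reports   -- list of (report_date, predicted_probability_of_recurrence)
    threshold -- hypothetical cut-off; the study's rule is not given here
    Returns (recurred, date_of_first_flagged_report_or_None).
    """
    flagged = sorted(d for d, p in reports if p >= threshold)
    if flagged:
        return True, flagged[0]
    return False, None


def sensitivity_specificity(truth, predicted):
    """truth/predicted: parallel lists of booleans, manual review as reference."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patient with three reports (not study data).
example_reports = [
    (date(2015, 3, 1), 0.05),
    (date(2016, 9, 12), 0.81),
    (date(2017, 1, 4), 0.93),
]
recurred, first_flag = patient_level_outcome(example_reports)

# BC cohort: 78 recurrences among 200 patients per manual review.
# The split below is inferred from the reported 92.3% / 92.6%.
bc_truth = [True] * 78 + [False] * 122
bc_pred = [True] * 72 + [False] * 6 + [False] * 113 + [True] * 9
bc_sens, bc_spec = sensitivity_specificity(bc_truth, bc_pred)
```

Under this rule the model-derived recurrence date is the date of the first flagged report, so the error in time-to-recurrence is the gap between that date and the manually reviewed recurrence date.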
Title: Integrating a Shareable Artificial Intelligence Model Into Clinical Research for Cancer Recurrence in Patients With Breast and Colorectal Cancer.
Journal: JCO Clinical Cancer Informatics, volume 9, article e2500143. Publication type: Journal Article. Periodical IF: 2.8.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12700351/pdf/
Pub Date: 2025-11-01 Epub Date: 2025-10-30 DOI: 10.1200/CCI-24-00327
Kathleen M Decker, Allison Feely, Iresha Ratnayake, Oliver Bucher, Piotr Czaykowski, Katie Galloway, Pamela Hebbard, Julian O Kim, Grace Musto, Marshall Pitz, Harminder Singh, Pascal Lambert
Purpose: This study examined the association between COVID-19 and cancer incidence by sex in Manitoba, Canada.
Methods: We used a population-based quasi-experimental study design and an interrupted time-series analysis to compare the rate of new cancer diagnoses between males and females before (January 2015 until December 2019) and after the start of the COVID-19 pandemic (April 2020 until December 2022).
Results: A total of 16,200 females and 20,631 males diagnosed with cancer between 2015 and 2022 in Manitoba were included. Colon cancer incidence decreased by 34% for males and females from April to September 2020. Incidence then remained stable for males but decreased by 22% from October 2021 to December 2022 for females. Brain and CNS cancer incidence decreased by 37% for males during 2021 and 2022, whereas for females it decreased (by 77%) only during the last quarter of 2020 and the first quarter of 2021. Urinary cancer incidence decreased by 18% for males from April 2020 to December 2022 but was stable for females. Head and neck cancer incidence decreased by 22% for males during 2020 but was stable for females. As of December 2022, the largest estimated cumulative differences in the number of cases occurred for males diagnosed with brain and CNS cancer (31.6% deficit, 76 cases), urinary cancer (18.4% deficit, 186 cases), and endocrine cancer (52.4% surplus, 56 cases), and for females diagnosed with colon cancer (19.7% deficit, 187 cases).
Conclusion: Sex-based differences in the association between age-standardized cancer incidence and the COVID-19 pandemic exist for several cancer sites. Sex-based differences in postpandemic cancer incidence, especially for brain, CNS, urinary, and colon cancers, need follow-up because of the ongoing deficits documented in this study.
{"title":"Measuring the Association Between the COVID-19 Pandemic and Cancer Incidence by Sex Using a Quasi-Experimental Study Design.","authors":"Kathleen M Decker, Allison Feely, Iresha Ratnayake, Oliver Bucher, Piotr Czaykowski, Katie Galloway, Pamela Hebbard, Julian O Kim, Grace Musto, Marshall Pitz, Harminder Singh, Pascal Lambert","doi":"10.1200/CCI-24-00327","DOIUrl":"10.1200/CCI-24-00327","url":null,"abstract":"<p><strong>Purpose: </strong>This study examined the association between COVID-19 and cancer incidence by sex in Manitoba, Canada.</p><p><strong>Methods: </strong>We used a population-based quasi-experimental study design and an interrupted time-series analysis to compare the rate of new cancer diagnoses between males and females before (January 2015 until December 2019) and after the start of the COVID-19 pandemic (April 2020 until December 2022).</p><p><strong>Results: </strong>A total of 16,200 females and 20,631 males diagnosed with cancer between 2015 and 2022 in Manitoba were included. Colon cancer incidence decreased by 34% for males and females from April to September 2020. Incidence then remained stable for males but decreased by 22% from October 2021 to December 2022 for females. Brain and CNS cancer incidence decreased by 37% for males during 2021 and 2022, whereas for females it decreased (by 77%) only during the last quarter of 2020 and the first quarter of 2021. Urinary cancer incidence decreased by 18% for males from April 2020 to December 2022 but was stable for females. Head and neck cancer incidence decreased by 22% for males during 2020 but was stable for females. 
As of December 2022, the largest estimated cumulative differences in the number of cases occurred for males diagnosed with brain and CNS cancer (31.6% deficit, 76 cases), urinary cancer (18.4% deficit, 186 cases), and endocrine cancer (52.4% surplus, 56 cases), and for females diagnosed with colon cancer (19.7% deficit, 187 cases).</p><p><strong>Conclusion: </strong>Sex-based differences in the association between age-standardized cancer incidence and the COVID-19 pandemic exist for several cancer sites. Sex-based differences in postpandemic cancer incidence, especially for brain, CNS, urinary, and colon cancers, need follow-up because of the ongoing deficits documented in this study.</p>","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2400327"},"PeriodicalIF":2.8,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12591556/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145410757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
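The cumulative deficits and surpluses quoted in this abstract compare observed case counts with a counterfactual projected from the prepandemic trend. A minimal sketch of that idea follows, assuming a simple linear prepandemic trend and synthetic quarterly counts. The study's actual interrupted time-series model is not specified in the abstract, and the function names `fit_line` and `cumulative_difference` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cumulative_difference(pre_counts, post_counts):
    """Compare observed post-period counts with the projected pre-period trend.

    Returns (expected_total, observed_total, percent_difference); a negative
    percentage is a deficit, a positive one a surplus.
    """
    pre_times = list(range(len(pre_counts)))
    slope, intercept = fit_line(pre_times, pre_counts)
    start = len(pre_counts)
    expected = [intercept + slope * (start + i) for i in range(len(post_counts))]
    expected_total = sum(expected)
    observed_total = sum(post_counts)
    percent = 100.0 * (observed_total - expected_total) / expected_total
    return expected_total, observed_total, percent


# Synthetic quarterly case counts (placeholders, not study data).
pre = [100 + 2 * t for t in range(8)]   # steady prepandemic growth
post = [90, 100, 118, 120]              # observed counts after the interruption

expected_total, observed_total, percent = cumulative_difference(pre, post)
print(f"expected {expected_total:.0f}, observed {observed_total}, "
      f"difference {percent:.1f}%")
```

A full segmented regression would also estimate level and slope changes with uncertainty intervals and adjust for seasonality; this sketch covers only the counterfactual comparison.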
Purpose: Acute care use (ACU) is more costly and prolonged for oncology patients and often leads to treatment disruptions and worsened outcomes. Reducing ACU requires understanding risk factors and proactively identifying at-risk patients. This study addresses research gaps by developing predictive models to assess all-cause acute care use (A-ACU) versus preventable acute care use (P-ACU) and rural-specific barriers.
Patients and methods: We conducted a retrospective cohort study of adult oncology patients who received intravenous cancer treatment between October 2021 and April 2024 within a rural midwestern regional cancer network. We used predictor and outcome data from electronic medical records and insurance claims. We defined P-ACU using the Centers for Medicare & Medicaid Services' OP-35 criteria and classified A-ACU as any emergency department visit or hospitalization, regardless of reason. We trained LASSO and Random Forest models on 80% of the cohort to predict 30-, 90-, and 180-day risk of P-ACU and A-ACU after regimen initiation.
Results: Among 2,922 patients, 45.3% experienced A-ACU and 10.3% had P-ACU within 180 days of chemotherapy regimen initiation. Key predictors included number of previous inpatient stays and comorbidities. Insurance type and age were more influential in predicting P-ACU, whereas laboratory values (albumin, sodium, and neutrophil-to-lymphocyte ratio) were more important in A-ACU models. Nearly all LASSO and Random Forest models showed strong performance (mean area under the receiver operating characteristic curve = 0.73, mean F1 score = 0.79).
Conclusion: Our models effectively identify patients at high risk for ACU using routinely collected data and validate known risk factors in a large rural oncology population. Future work should integrate these tools into practice and address rural-specific challenges to reduce ACU during chemotherapy.
{"title":"Acute Care Utilization Patterns During Chemotherapy and Predictive Model Development at a Rural Community Cancer Center.","authors":"McKenna Perrin, Crystal Hattum, Jamie Arens, Tobias Meissner","doi":"10.1200/CCI-25-00186","DOIUrl":"10.1200/CCI-25-00186","url":null,"abstract":"<p><strong>Purpose: </strong>Acute care use (ACU) is more costly and prolonged for oncology patients and often leads to treatment disruptions and worsened outcomes. Reducing ACU requires understanding risk factors and proactively identifying at-risk patients. This study addresses research gaps by developing predictive models to assess all-cause acute care use (A-ACU) versus preventable acute care use (P-ACU) and rural-specific barriers.</p><p><strong>Patients and methods: </strong>We conducted a retrospective cohort study of adult oncology patients who received intravenous cancer treatment between October 2021 and April 2024 within a rural midwestern regional cancer network. We used predictor and outcome data from electronic medical records and insurance claims. We defined P-ACU using the Centers for Medicare & Medicaid Services' OP-35 criteria and classified A-ACU as any emergency department visit or hospitalization, regardless of reason. We trained LASSO and Random Forest models on 80% of the cohort to predict 30-, 90-, and 180-day risk of P-ACU and A-ACU after regimen initiation.</p><p><strong>Results: </strong>Among 2,922 patients, 45.3% experienced A-ACU and 10.3% had P-ACU within 180 days of chemotherapy regimen initiation. Key predictors included number of previous inpatient stays and comorbidities. Insurance type and age were more influential in predicting P-ACU, whereas laboratory values (albumin, sodium, and neutrophil-to-lymphocyte ratio) were more important in A-ACU models. 
Nearly all LASSO and Random Forest models showed strong performance (mean area under the receiver operating characteristic curve = 0.73, mean F1 score = 0.79).</p><p><strong>Conclusion: </strong>Our models effectively identify patients at high risk for ACU using routinely collected data and validate known risk factors in a large rural oncology population. Future work should integrate these tools into practice and address rural-specific challenges to reduce ACU during chemotherapy.</p>","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2500186"},"PeriodicalIF":2.8,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12637137/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145514685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
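The two performance metrics reported for these models, area under the receiver operating characteristic curve (AUROC) and F1 score, can be computed from predicted risks and observed outcomes as sketched below. The labels, scores, and 0.5 decision threshold are synthetic assumptions for illustration; the abstract does not report the thresholds or per-model values behind the means.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random
    negative (ties count as half), i.e. the Mann-Whitney formulation."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def f1_score(labels, scores, threshold=0.5):
    """F1 = harmonic mean of precision and recall at a given threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Synthetic example: 1 = acute care use within the window, 0 = none.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90, 0.20, 0.60]

print(f"AUROC = {auroc(labels, scores):.2f}")
print(f"F1    = {f1_score(labels, scores):.2f}")
```

AUROC is threshold-free, whereas F1 depends on the chosen cutoff and on outcome prevalence, so the two numbers are not directly comparable across the A-ACU and P-ACU outcomes.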
Ji Hyun Chang, Amir Ashraf-Ganjouei, Isabel Friesner, Ryzen Benson, Travis Zack, Sumi Sinha, Jason Chan, Steve Braunstein, Amy Lin, Lisa Singer, Julian C Hong
Purpose: The increasing use of patient portal messages has enhanced patient-provider communication. However, the high volume of these messages has also contributed to physician burnout.
Methods: Patient-generated portal messages sent to a single cancer center from 2011 to 2023 were extracted. BERTopic, a natural language processing topic modeling technique based on large language models, was optimized. For further categorization, the topic words were labeled using GPT-4, followed by review by two oncologists. Uniform Manifold Approximation and Projection was used for dimensionality reduction and visualizing topics. Message volume changes over time were assessed using a Student's t test.
Results: A total of 2,280,851 messages were analyzed. The monthly average number of messages increased from 2,071 in 2012 to 43,430 in 2022 (P < .001). There was a significant rise in message volume after the COVID-19 pandemic, with a posterior probability of a causal effect of 96.4% (P = .04). Scheduling-related messages were the most frequent across departments, whereas symptoms and health concerns were the second or third most common topics. In medical oncology and surgical oncology, topics on prescriptions and medications were more common than in radiation oncology and gynecologic oncology. Despite concurrent institutional changes in self-scheduling systems, scheduling-related messages did not decrease over time.
Conclusion: The substantial increase in patient portal messages, particularly scheduling-related inquiries, underscores the need for streamlined communication to reduce the burden on health care providers. These findings highlight the need for strategies to manage message volume and mitigate physician burnout, laying groundwork for artificial intelligence-driven future triage systems to improve message management and patient care.
{"title":"Unsupervised Large Language Models to Identify Topics in Cancer Center Patient Portal Messages.","authors":"Ji Hyun Chang, Amir Ashraf-Ganjouei, Isabel Friesner, Ryzen Benson, Travis Zack, Sumi Sinha, Jason Chan, Steve Braunstein, Amy Lin, Lisa Singer, Julian C Hong","doi":"10.1200/CCI-25-00102","DOIUrl":"10.1200/CCI-25-00102","url":null,"abstract":"<p><strong>Purpose: </strong>The increasing use of patient portal messages has enhanced patient-provider communication. However, the high volume of these messages has also contributed to physician burnout.</p><p><strong>Methods: </strong>Patient-generated portal messages sent to a single cancer center from 2011 to 2023 were extracted. BERTopic, a natural language processing topic modeling technique based on large language models, was optimized. For further categorization, the topic words were labeled using GPT-4, followed by review by two oncologists. Uniform Manifold Approximation and Projection was used for dimensionality reduction and visualizing topics. Message volume changes over time were assessed using a Student's <i>t</i> test.</p><p><strong>Results: </strong>A total of 2,280,851 messages were analyzed. The monthly average number of messages increased from 2,071 in 2012 to 43,430 in 2022 (<i>P</i> < .001). There was a significant rise in message volume after the COVID-19 pandemic, with a posterior probability of a causal effect of 96.4% (<i>P</i> = .04). Scheduling-related messages were the most frequent across departments, whereas symptoms and health concerns were the second or third most common topics. In medical oncology and surgical oncology, topics on prescriptions and medications were more common than in radiation oncology and gynecologic oncology. 
Despite concurrent institutional changes in self-scheduling systems, scheduling-related messages did not decrease over time.</p><p><strong>Conclusion: </strong>The substantial increase in patient portal messages, particularly scheduling-related inquiries, underscores the need for streamlined communication to reduce the burden on health care providers. These findings highlight the need for strategies to manage message volume and mitigate physician burnout, laying groundwork for artificial intelligence-driven future triage systems to improve message management and patient care.</p>","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2500102"},"PeriodicalIF":2.8,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490804/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145208048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
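The Methods of this abstract state that message volume changes were assessed with a Student's t test. As a rough sketch, the pooled-variance two-sample t statistic can be computed as shown below. The monthly counts are synthetic placeholders rather than study data, and `students_t` is a hypothetical helper name.

```python
import math
import statistics


def students_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom); t is positive when sample_b has the
    larger mean. Assumes equal variances in the two groups.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / standard_error, df


# Synthetic monthly message counts for an early and a late year.
early_months = [2000, 2100, 2050, 2150]
late_months = [43000, 43500, 43400, 43800]

t_stat, df = students_t(early_months, late_months)
print(f"t = {t_stat:.1f} on {df} degrees of freedom")
```

When group variances differ as much as they would between low-volume and high-volume years, Welch's unequal-variance version may be the more cautious choice; the abstract does not say which variant was used.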
Pub Date: 2025-10-01 Epub Date: 2025-10-29 DOI: 10.1200/CCI-24-00259
Johannes Mammen, Calin-Petru Manta, Sarah Richter, Nora Liebers, Tobias Roider, Felix Czernilofsky, Katharina Kriegsmann, Carsten Müller-Tidow, Michael Hundemer, Sascha Dietrich
Purpose: Flow cytometry is a key diagnostic technique in hematology that provides protein information at a single-cell level. Flow cytometry data are traditionally interpreted manually in a sequence of two-dimensional plots, but automated analysis techniques have grown in significance in both research and clinical practice, improving interrater reliability and speeding up analysis. Published tools usually require a specific diagnostic setup, which hinders widespread implementation.
Methods: In this paper, we present the development of a software package and web app (diagnFlow) for the automated analysis of any in-house clinical flow cytometry data set. We exemplify the application of this classifier and its clinical benefit in lymphoma diagnosis and other settings.
Results: Routine performance for the focused diagnostic task was evaluated in a blinded one-examiner setup. Multiple customary workflows solving the task in an automated manner were designed using diagnFlow. Each workflow could improve on the performance of the manual interpretation. The most easily interpretable and computationally efficient workflow outperformed more complicated approaches and was made available as an easy-to-use web app. Same-sample wet laboratory data further elucidated the biological signal the classifier is based on. The approach made available as a web app was validated in additional data sets, where it outperformed a competition-winning clustering-based approach.
Conclusion: diagnFlow provides a valuable data set-agnostic approach to flow cytometry data sets previously not leveraged for automatic analysis while maintaining interpretability and resource efficiency.
{"title":"Machine Learning Designed for Any Hematologic Flow Cytometry Data Set.","authors":"Johannes Mammen, Calin-Petru Manta, Sarah Richter, Nora Liebers, Tobias Roider, Felix Czernilofsky, Katharina Kriegsmann, Carsten Müller-Tidow, Michael Hundemer, Sascha Dietrich","doi":"10.1200/CCI-24-00259","DOIUrl":"https://doi.org/10.1200/CCI-24-00259","url":null,"abstract":"<p><strong>Purpose: </strong>Flow cytometry is a key diagnostic technique in hematology that provides protein information at a single-cell level. Flow cytometry data are traditionally interpreted manually in a sequence of two-dimensional plots, but automated analysis techniques have grown in significance in both research and clinical practice, improving interrater reliability and speeding up analysis. Published tools usually require a specific diagnostic setup, which hinders widespread implementation.</p><p><strong>Methods: </strong>In this paper, we present the development of a software package and web app (diagnFlow) for the automated analysis of any in-house clinical flow cytometry data set. We exemplify the application of this classifier and its clinical benefit in lymphoma diagnosis and other settings.</p><p><strong>Results: </strong>Routine performance for the focused diagnostic task was evaluated in a blinded one-examiner setup. Multiple customary workflows solving the task in an automated manner were designed using diagnFlow. Each workflow could improve on the performance of the manual interpretation. The most easily interpretable and computationally efficient workflow outperformed more complicated approaches and was made available as an easy-to-use web app. Same-sample wet laboratory data further elucidated the biological signal the classifier is based on. 
The approach made available as a web app was validated in additional data sets, where it outperformed a competition-winning clustering-based approach.</p><p><strong>Conclusion: </strong>diagnFlow provides a valuable data set-agnostic approach to flow cytometry data sets previously not leveraged for automatic analysis while maintaining interpretability and resource efficiency.</p>","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2400259"},"PeriodicalIF":2.8,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145402756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}