Pub Date: 2026-03-01; Epub Date: 2026-03-19; DOI: 10.1200/CCI-25-00041
Abdul J Zakkar, Nazia Perwaiz, Vikram Harikrishnan, Weiheng Zhong, Vijeth Narra, Farah Yousef, Daniel Kim, Mason Burrage-Burton, Abdul Afeez Lawal, Vijayakrishna K Gadi, Mark C Korpics, Sage J Kim, Zhengjia Chen, Aly A Khan, Yamilé Molina, Yang Dai, G Elisabeta Marai, Hadi Meidani, Ryan H Nguyen, Ameen A Salahudeen
Purpose: Disparities in lung cancer incidence exist in Black populations, and screening criteria underserve Black populations due to disparately elevated risk in the screening-eligible population. Prediction models that integrate clinical and imaging-based features to individualize lung cancer risk are a potential means to mitigate these disparities.
Methods: This multicenter (National Lung Screening Trial [NLST]) and catchment population-based (University of Illinois Health [UIH], urban and suburban Cook County) cross-sectional study used participants at risk of lung cancer with available lung computed tomography (CT) imaging and follow-up between the years 2015 and 2024. In all, 53,452 in NLST and 11,654 in UIH were included on the basis of age and tobacco use-based risk factors for lung cancer. Cohorts were used for training and testing of deep and machine learning models using clinical features alone or combined with CT image features (hybrid computer vision).
Results: An optimized seven-feature clinical model achieved receiver operating characteristic (ROC)-AUC values ranging from 0.64 to 0.67 in NLST and 0.60 to 0.65 in UIH cohorts across multiple years. Incorporating imaging features to form a hybrid computer vision model significantly improved ROC-AUC values to 0.78-0.91 in NLST, but performance deteriorated in UIH (ROC-AUC values of 0.68-0.80), attributable to Black participants, among whom ROC-AUC values ranged from 0.63 to 0.72 across multiple years. Retraining the hybrid computer vision model with Black and other participants from the UIH cohort improved performance, with ROC-AUC values of 0.70-0.87 in a held-out UIH test set.
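The subgroup comparison reported in these results depends on computing ROC-AUC separately for each cohort or participant group. The following is a minimal sketch of that idea, not the authors' pipeline: it uses the pairwise (Mann-Whitney) definition of AUC in plain Python, and the labels, scores, and group names are toy values invented for illustration.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute ROC-AUC separately within each group label."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: roc_auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Toy data (hypothetical): 1 = cancer, 0 = no cancer; "A"/"B" are placeholder groups.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.6, 0.7, 0.3, 0.5]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(roc_auc(labels, scores))               # 0.75 overall
print(auc_by_group(labels, scores, groups))  # {'A': 1.0, 'B': 0.25}
```

In this toy case the pooled AUC of 0.75 hides a large gap between the two groups, which is the kind of pattern a stratified evaluation is meant to surface.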
Conclusion: Hybrid computer vision predicted risk with improved accuracy compared with clinical risk models alone. However, potential biases in image training data reduced model generalizability in Black participants. Performance was improved upon retraining with a subset of the UIH cohort, suggesting that inclusive training and validation data sets can minimize racial disparities. Future studies incorporating vision models trained on representative data sets may demonstrate improved health equity upon clinical use.
Title: Hybrid Computer Vision Model to Predict Lung Cancer in Diverse Populations.
Journal: JCO Clinical Cancer Informatics, volume 10, e2500041.
Pub Date: 2026-03-01; Epub Date: 2026-03-11; DOI: 10.1200/CCI-25-00250
Abhishek Shivanna, Adam Spannaus, Jordan Tschida, John Gounley, Patrycja Krawczuk, Heidi Hanson
Purpose: Integrating artificial intelligence in cancer diagnostics has improved tumor classification beyond rule-based systems. Despite these advancements, these models may still encode demographic biases. We conducted a large-scale, applied bias-probing study of a deep learning-based cancer site classifier to quantify race information encoded in document embeddings. We then evaluated how performance changes when race-correlated embedding dimensions are removed in a post-training sensitivity analysis.
Methods: The cancer site classifier was trained using 3.5 million electronic cancer pathology reports from six of the National Cancer Institute's SEER registries. We trained a hierarchical self-attention network to generate 400-dimensional document embeddings. These embeddings were used to train two downstream, gradient-boosted decision tree classifiers: one to classify the cancer sites and another to predict racial categories. We identified overlapping features by intersecting the top 50 feature-importance rankings from the site and race models and computed their cumulative feature importance in each model. As a post hoc sensitivity analysis, we progressively pruned these overlapping dimensions, retrained the site model, and compared overall macro-F1 and accuracy, race-stratified macro-F1, and group fairness metrics on the basis of demographic parity and equalized odds before and after pruning.
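The overlap analysis described above (intersect the top-ranked features of the two models, then sum their importance in each model) can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical, the importance vectors are toy values over six dimensions rather than 400, and `k` is set to 2 instead of 50 so the result is easy to check by hand.

```python
def top_k(importances, k):
    """Return the indices of the k largest importance values."""
    order = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
    return set(order[:k])


def overlap_report(site_imp, race_imp, k=50):
    """Intersect the top-k features of two models and sum their importance in each."""
    shared = top_k(site_imp, k) & top_k(race_imp, k)
    return {
        "shared": sorted(shared),
        # Share of each model's total importance carried by the shared dimensions.
        "site_share": sum(site_imp[i] for i in shared) / sum(site_imp),
        "race_share": sum(race_imp[i] for i in shared) / sum(race_imp),
    }


# Toy importance vectors (hypothetical), one value per embedding dimension.
site_imp = [0.40, 0.30, 0.10, 0.10, 0.05, 0.05]
race_imp = [0.05, 0.35, 0.05, 0.05, 0.40, 0.10]

report = overlap_report(site_imp, race_imp, k=2)
print(report["shared"])  # [1]: only dimension 1 is in both top-2 lists
```

A small shared set with a low cumulative share in the site model would correspond to the "minimal overlap" pattern the abstract reports. The pruning step would then drop the shared indices from the embedding before retraining.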
Results: The analysis revealed minimal feature overlap between the cancer site and race prediction models, and the cumulative importance scores indicated a negligible influence of racial information on clinical predictions. Post-training pruning of overlapping features did not compromise the models' diagnostic accuracy, with a 0.07% loss in accuracy.
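The before-and-after comparison relies on the group fairness metrics named in the Methods. The abstract does not say how they were computed, so the sketch below uses one common formulation for a binary (one-vs-rest) prediction. Demographic parity difference is the spread in selection rate across groups. Equalized odds difference is the larger of the spreads in true-positive and false-positive rate. All data and function names are hypothetical.

```python
def rate(numerator, denominator):
    """Safe ratio that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def group_rates(y_true, y_pred, groups):
    """Selection rate, TPR, and FPR for each group."""
    stats = {}
    for g in set(groups):
        rows = [(t, p) for t, p, gg in zip(y_true, y_pred, groups) if gg == g]
        tp = sum(1 for t, p in rows if t == 1 and p == 1)
        fp = sum(1 for t, p in rows if t == 0 and p == 1)
        pos = sum(1 for t, _ in rows if t == 1)
        neg = len(rows) - pos
        stats[g] = {
            "selection": rate(tp + fp, len(rows)),
            "tpr": rate(tp, pos),
            "fpr": rate(fp, neg),
        }
    return stats


def spread(stats, key):
    """Largest minus smallest value of one rate across groups."""
    values = [s[key] for s in stats.values()]
    return max(values) - min(values)


def demographic_parity_difference(stats):
    return spread(stats, "selection")


def equalized_odds_difference(stats):
    return max(spread(stats, "tpr"), spread(stats, "fpr"))


# Toy data (hypothetical): one cancer site treated as the positive class.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

stats = group_rates(y_true, y_pred, groups)
print(demographic_parity_difference(stats))  # 0.5
print(equalized_odds_difference(stats))      # 0.5
```

Running the same functions on predictions made before and after pruning gives the paired comparison the study describes. Values near zero indicate similar treatment across groups under these definitions.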
Conclusion: Our findings demonstrate that HiSAN-generated embeddings from SEER data can be used effectively in cancer site classification without significant demographic bias influencing the outcomes. Post-training pruning therefore functions as a practical audit and sensitivity check.
Title: Mitigating Algorithmic Bias in Cancer Site Classification Models.
Journal: JCO Clinical Cancer Informatics, volume 10, e2500250.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13001901/pdf/
Pub Date: 2026-03-01; Epub Date: 2026-03-09; DOI: 10.1200/CCI-25-00323
Chin Hang Yiu, Edward C Y Lau, Charlotte Thuy Tien Le, Christine Y Lu
Purpose: To systematically map how artificial intelligence (AI) is being applied to immune-related adverse events (irAEs) induced by immune checkpoint inhibitors (ICIs), and to identify key knowledge gaps and future directions for responsible implementation.
Methods: We conducted a scoping review in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guideline. MEDLINE (Ovid), Embase, and Scopus were searched from January 1, 2015, to August 24, 2025. Eligible studies applied at least one AI method (eg, machine learning, natural language processing) to investigate irAEs. Studies were grouped into three clinical domains: (1) risk prediction, (2) identification/detection, and (3) clinical information/decision support. Data were synthesized narratively and mapped descriptively.
Results: Forty studies met inclusion criteria, encompassing 45,897 ICI-treated patients. Most applied AI for risk prediction (n = 27), followed by identification/detection (n = 10) and decision support (n = 3). AI approaches showed promise in detecting irAEs from structured and unstructured data, stratifying patient-level risk, and supporting clinical decision making. However, methodological limitations were common: most studies used retrospective data and lacked external validation, limiting clinical applicability.
Conclusion: AI shows potential to enhance ICI safety by enabling earlier detection of irAEs, personalized risk prediction, and scalable clinical support tools. To support clinical translation, future research must prioritize external and prospective validation, standardized outcome reporting, and impact evaluation (eg, effects on clinical outcomes and workflows) within robust governance frameworks.
Title: Leveraging Artificial Intelligence for Immune Checkpoint Inhibitor Safety: A Scoping Review of Current Applications.
Journal: JCO Clinical Cancer Informatics, volume 10, e2500323.
Pub Date: 2026-03-01; Epub Date: 2026-03-11; DOI: 10.1200/CCI-25-00254
Kamyar Arzideh, René Hosch, Amin Turki, Bahadir Eryilmaz, Mikel Bahn, Henning Schäfer, Ahmad Idrissi-Yaghir, Sameh Khattab, Amin Dada, Hideo A Baba, Dirk Schadendorf, Martin Schuler, Jens Kleesiek, Sylvia Hartmann, Felix Nensa, Julius Keyl
Purpose: Manual coding of pathology reports with International Classification of Diseases for Oncology (ICD-O)-3 codes is time-consuming, error-prone, and resource-intensive for health care institutions. This study evaluates multiple state-of-the-art open-source large language models (LLMs) on extracting ICD-O-3 topography and morphology codes from real-world pathology reports across several evaluation setups and assesses their potential for clinical implementation.
Methods: We analyzed 21,364 pathology reports from 10,823 patients documented between 2013 and 2025 at a large German hospital. Five LLMs were evaluated: Llama-3.3-70B-Instruct, DeepSeek-R1-Distill-Llama (8B and 70B variants), Qwen3-235B-A22B, and Gemma-3-12B-it. All models were deployed on secured, private hospital information technology infrastructure. Three different prompts were developed for topography extraction (with and without anatomic context) and morphology extraction. Performance was evaluated using exact code matches and matches on the first three positions of the code.
Results: For exact ICD-O topography code prediction, Qwen3-235B-A22B achieved the highest performance (microaverage F1: 71.6%), whereas Llama-3.3-70B-Instruct performed best at predicting the first three characters (microaverage F1: 84.6%). For morphology codes, DeepSeek-R1-Distill-Llama-70B outperformed other models (exact microaverage F1: 34.7%; first-three-character microaverage F1: 77.8%). Large disparities between micro- and macroaverage F1-scores indicated poor generalization to rare conditions.
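The two scoring modes (exact code match and first-three-character match) and the gap between micro- and macroaverage F1 can be illustrated with a small example. This is a sketch under assumptions rather than the study's evaluation code: it treats each report as having a single gold code and a single predicted code, averages the macro score over every code that appears in either list, and uses six made-up report labels.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred, prefix_len=None):
    """Micro- and macroaverage F1 for single-label code prediction.

    If prefix_len is given, codes are truncated first, so a prediction
    counts as correct when its leading characters match the gold code.
    """
    if prefix_len is not None:
        gold = [g[:prefix_len] for g in gold]
        pred = [p[:prefix_len] for p in pred]
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted code was wrong for this report
            fn[g] += 1  # gold code was missed
    codes = set(gold) | set(pred)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in codes) / len(codes)
    return micro, macro


# Toy topography codes (illustrative only): one frequent code, several rare ones.
gold = ["C34.1", "C34.1", "C34.1", "C34.3", "C50.4", "C18.7"]
pred = ["C34.1", "C34.1", "C34.9", "C34.3", "C50.9", "C34.1"]

exact_micro, exact_macro = micro_macro_f1(gold, pred)
prefix_micro, prefix_macro = micro_macro_f1(gold, pred, prefix_len=3)
print(f"exact:  micro={exact_micro:.3f} macro={exact_macro:.3f}")
print(f"prefix: micro={prefix_micro:.3f} macro={prefix_macro:.3f}")
```

Microaveraging pools counts over all reports, so frequent codes dominate. Macroaveraging weights each code equally, so errors on rare codes pull the score down. That is why a large micro-macro gap points to weak performance on rare conditions.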
Conclusion: Although LLMs demonstrate promising capabilities as support systems for expert-guided pathology coding, their performance is not yet sufficient for fully automated, unsupervised use in routine clinical workflows. LLMs showed poor performance on rare conditions, heavy dependence on contextual information, and substantially lower scores for morphology versus topography classification.
Title: Automated Tumor International Classification of Diseases Coding of Real-World Pathology Reports Using Self-Hosted Large Language Models.
Journal: JCO Clinical Cancer Informatics, volume 10, e2500254.
Pub Date: 2026-03-01; Epub Date: 2026-03-10; DOI: 10.1200/CCI-26-00019
Irbaz Bin Riaz
Title: Large Language Models in Oncology: Navigating Promise and Prudence in a Rapidly Evolving Landscape.
Journal: JCO Clinical Cancer Informatics, volume 10, e2600019.
Pub Date: 2026-03-01; Epub Date: 2026-03-18; DOI: 10.1200/CCI-25-00276
Tsung-Ying Lee, Eberechukwu Onukwugha, Abree Johnson, Chih Chun Tung, Jessica Dohler, Bindu Kanapuru, Catherine C Lerro, Eun-Shim Nahm, Colleen Reilly, Donna R Rivera, Jonathan Vallejo, Felice Yang, Jessica Wimbush, Joanne F Dorgan
Purpose: This study assessed the feasibility of developing the University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center (UMGCCC)-Medicare-linked database infrastructure by integrating tumor registry, electronic health records (EHRs), and Medicare administrative claims data. The database was designed to support research identifying determinants of differences in cancer outcomes among patient populations commonly under-represented in clinical trials (based on the US population with the disease) including older adults.
Methods: Patients 65 years and older who were diagnosed and/or received their first course of treatment for a primary tumor at UMGCCC from 2018 to 2021 were included in the database. A two-stage data linkage process was used to merge cancer center tumor registry data with EHR and Medicare claims data. We performed data quality and linkage quality checks. Summary statistics were calculated for patient and tumor characteristics.
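A staged linkage with a quality check at each step can be sketched as follows. The abstract does not say what the two stages were or which identifiers were used. The keys `mrn` and `bene_id`, the order of the joins, and all records below are therefore hypothetical and serve only to show the general pattern.

```python
def link(left, right, key):
    """Inner-join two lists of dict records on a shared key."""
    index = {record[key]: record for record in right}
    return [{**row, **index[row[key]]} for row in left if row.get(key) in index]


def linkage_rate(linked, source):
    """Fraction of source records that survived the linkage."""
    return len(linked) / len(source) if source else 0.0


# Hypothetical records; identifiers and fields are invented for illustration.
registry = [
    {"mrn": "A1", "site": "lung"},
    {"mrn": "A2", "site": "oral"},
    {"mrn": "A3", "site": "non-Hodgkin lymphoma"},
]
ehr = [
    {"mrn": "A1", "bene_id": "B1"},
    {"mrn": "A2", "bene_id": "B2"},
]
claims = [{"bene_id": "B1", "n_claims": 12}]

# Stage 1 (assumed): registry to EHR on a medical record number.
stage1 = link(registry, ehr, "mrn")
# Stage 2 (assumed): stage-1 output to claims on a beneficiary identifier.
stage2 = link(stage1, claims, "bene_id")

print(len(stage1), len(stage2))  # 2 1
print(round(linkage_rate(stage2, registry), 3))
```

Reporting the retention rate after each stage is one simple linkage quality check, since it shows where records drop out.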
Results: Of the 3,322 patients identified from the tumor registry, 3,119 patients (94%) were included in the UMGCCC-Medicare database (mean age 73.1 years, 56% male, 31% Black). Lung cancers were the most common (15%) followed by oral cancers (12%) and non-Hodgkin lymphoma (6%).
Conclusion: The development of the UMGCCC-Medicare database serves as proof of concept for linking real-world data from different sources. The database is a valuable resource for research requiring detailed patient-level data and follow-up that may generate real-world evidence for older adults living in the United States and treated in routine oncology practice.
Title: Building Capacity for Research on Cancer, Older Adults, and Under-Represented Populations: Methods and Lessons Learned From the Development of the University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center-Medicare Database.
Journal: JCO Clinical Cancer Informatics, volume 10, e2500276.
Pub Date: 2026-02-01; Epub Date: 2026-02-03; DOI: 10.1200/CCI-25-00028
Margaret Guo, Evan Passalacqua, Erik Bao, Brenda Miao, Atul Butte, Travis Zack
Purpose: Biomarkers, or specific somatic alterations, are increasingly required for clinical trial eligibility. Finding and enrolling patients with these biomarkers is essential not only for continuous progress in the treatment of disease but also for democratizing clinical trial participation. Here, we use data from the National Cancer Institute Clinical Trials Reporting Program (NCI CTRP), combined with large language model applications, to survey the current landscape of cancer clinical trials.
Methods: We extracted 20,894 trials listed on Cancer.gov through the application programming interface (API) of the NCI CTRP. We quantified biomarker rates in cancer subtypes, described the geographic distribution of trial sites, and identified failure causes for these trials. Finally, we built an application from this API to match patients with clinical trials.
Results: We showed that 5,044 of the 20,894 interventional clinical trials contained biomarker eligibility data and that trials tended to cluster around large academic centers and cities. We identified 630 biomarkers in 36 cancer subtypes and showed that most biomarkers are used as eligibility criteria for multiple cancer subtypes. Difficulties with accrual and sponsorship were the most common reasons for discontinuing clinical trials. Finally, we demonstrate a novel method to automatically match natural language queries with eligible clinical trials, the NCI Clinical Trials Navigator.
Conclusion: A survey of our clinical genomics showed that many individuals likely have mutations that would make them eligible for biomarker-driven trials. We used the NCI Clinical Trials database to show that the distribution of biomarker trials across the United States limits access for many patients and likely leads to frequent trial termination because of inadequate accrual. Finally, we built an automated, publicly available tool that can improve patient-to-trial biomarker-based matching.
Title: Exploring the Past and Current Landscape of Biomarker-Driven Clinical Trials Through Large Language Models.
Journal: JCO Clinical Cancer Informatics, volume 10, e2500028.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12871862/pdf/
Pub Date: 2026-02-01 | Epub Date: 2026-02-05 | DOI: 10.1200/CCI-25-00322
Luxiga Thanabalachandran, Khaled Zaza, Renee Hartzell, Kimberley Miller, Geneviève C Digby, Taylor Moffat, Melinda Mushonga, Kristin Wright, Melanie Powis, John Drover, Siddhartha Srivastava, Monika K Krzyzanowska, Yuchen Li
Purpose: Electronic health record (EHR) systems aim to improve efficiency, care coordination, and patient safety, yet implementation often introduces workflow challenges and staff burden. In 2024, the Cancer Centre of Southeastern Ontario (CCSEO), a regional academic cancer center in Canada, transitioned from a hybrid paper-electronic system to a fully integrated regional EHR. Although hospital EHR adoption has been studied, limited research has examined its impact within ambulatory oncology care, particularly among nonphysician staff, or how institutions responded to the findings. Our study explored oncology healthcare worker perspectives on EHR implementation at CCSEO and identified resulting quality-improvement (QI) initiatives.
Methods: Using purposeful maximum variation sampling, we recruited clinical, administrative, and research staff. Semistructured interviews explored workflow efficiency, documentation burden, staff wellness, patient safety, communication, and training. Data were audio-recorded, transcribed, and analyzed thematically using MAXQDA.
Results: Nineteen interviews were conducted, at which point thematic saturation was reached. Three major themes emerged. (1) Efficiency and workflow: Staff valued consolidated records and regional connectivity but reported navigation complexity, time burden, duplicate orders, reliance on multiple programs, and frequent workarounds. (2) Staff and patient wellness: Staff noted that limited training, increased workload, cognitive overload, and reliance on peer support contributed to burnout. (3) Patient safety: Identified risks included order and medication errors, communication breakdowns, poor system visualization, imaging delays, and wristband or labeling issues. Several QI initiatives were implemented in response, including education and navigation rounds, formation of working groups, and integration of artificial intelligence.
Conclusion: EHR implementation introduced both benefits and challenges in oncology workflows. Findings informed multidisciplinary QI initiatives targeting role-specific training, workflow optimization, and safety, offering a framework for other cancer centers transitioning to new EHR systems.
Citation: Health Care Worker Perspectives After New Electronic Health Record Implementation in an Oncology Ambulatory Clinic: Qualitative and Quality-Improvement Insights. JCO Clinical Cancer Informatics, volume 10, e2500322 (2026-02-01). https://doi.org/10.1200/CCI-25-00322
Pub Date: 2026-02-01 | Epub Date: 2026-02-03 | DOI: 10.1200/CCI-25-00234
Paris A Kosmidis, Thanos Kosmidis, Kyriaki Papadopoulou, Nikolaos Korfiatis, Athanasios Vozikis, Sofia Lampaki, Panagiota Economopoulou, Elena Fountzilas, Athina Christopoulou, Epaminondas Samantas, Anastasios Vagionas, Giannis Socrates Mountzios, Georgios Goumas, Nikolaos Tsoukalas, Ilias Athanasiadis, Dimitris Bafaloukos, Chris Panopoulos, Margarita Ioanna Koufaki, George Fountzilas, Georgios Petrakis, Helena Linardou
Purpose: This trial aims to investigate the effectiveness of an online digital intervention in patients with non-small cell lung cancer (NSCLC) in terms of adverse events (AEs) and quality of life (QoL).
Methods: This randomized trial recruited 200 patients with advanced NSCLC (March 2022-October 2023). All patients received standard-of-care precise treatment, predominantly immunochemotherapy. The study was designed to assess AEs and QoL improvement. Through the CareAcross online platform, all patients received information about their disease and treatment and could report any of the 22 predefined AEs at any time. Patients were randomly assigned 1:1 to the intervention (A) and control (B) arms; patients in arm A additionally received automated, evidence-based guidance for the AEs they reported. EuroQol 5-dimension 5-level responses were collected at baseline and at each treatment cycle. The resulting scores were compared between baseline and after the sixth cycle. In addition, patient case-level hospitalization data were collected, and costs were estimated from reimbursed costs as defined by the Ministry of Health, enabling a post hoc analysis.
Results: Clinical characteristics were well-balanced. Patients reported more AEs online than to their clinicians (P < .01). Among the 22 AEs, 17 improved more in arm A, with the improvement in rash and stomatitis being statistically significant. QoL did not improve in any of the five EuroQol 5-Dimension dimensions. The digital intervention was cost-saving, with lower mean costs for hospitalization (P < .001). Overall response rate, progression-free survival, and overall survival were not statistically different between the two arms, indicating comparable clinical outcomes.
Conclusion: Digital oncology tends to improve selected AEs and is cost-saving. Patients' AE reports are more informative when submitted digitally. Digital oncology can be a complementary tool for the oncology team and warrants further exploration.
Citation: SNF-CLIMEDIN: A Randomized Trial of Digital Support and Intervention in Patients With Advanced Non-Small Cell Lung Cancer. A Hellenic Cooperative Oncology Group Study. JCO Clinical Cancer Informatics, volume 10, e2500234 (2026-02-01). https://doi.org/10.1200/CCI-25-00234
Pub Date: 2026-02-01 | Epub Date: 2026-02-18 | DOI: 10.1200/CCI-25-00210
Filippo Pesapane, Emilia Giambersio, Anna Rotili, Roberto Grasso, Aurora Gaeta, Ottavia Battaglia, Lorenzo Conti, Silvia Francesca Maria Pizzoli, Sara Raimondi, Sara Gandini, Gabriella Pravettoni, Enrico Cassano
Purpose: Artificial intelligence (AI) is fast becoming a vital part of health care, dramatically affecting physicians' workflows and patients' outcomes. Understanding patients' opinions on its use is thus essential to ensure its successful adoption. This study aims to evaluate public perceptions of AI in health care and explore patient feedback through a survey.
Methods: From January 2023 to June 2024, a survey on AI in health care was distributed to the public via a QR code shared through social media, posters, and videos, reaching 454 participants, of whom 240 completed the survey. Adapted from a validated 2020 model by Esmaeilzadeh et al, the survey underwent careful translation and cultural adjustment for the Italian population, including forward-backward translation and pilot testing. The survey assessed topics such as willingness to use AI, performance anxiety, liability concerns, privacy issues, and AI's effect on doctor-patient communication. Responses were scored, with lower scores indicating greater acceptance of AI.
Results: The survey showed that 96% supported AI as a tool to assist radiologists and 92% were open to using AI for diagnostics and treatments. Concerns included reliability (61%) and reduced personal interaction (58%). Seventy-two percent trusted AI with data privacy. Overall, 90.4% viewed AI positively.
Conclusion: The study highlights a balanced perspective on AI in health care. While recognizing its potential to enhance diagnostics and treatments, participants raised concerns about reliability, accountability, and interpersonal impacts. Most supported AI as a tool to complement, not replace, human expertise, emphasizing the need for transparent, reliable systems.
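As a quick arithmetic check on the survey figures reported above, the following minimal sketch computes the completion rate (240 of 454) and converts the reported percentages into approximate respondent counts. It is illustrative only: the function names are hypothetical, and the assumption that every percentage was computed over the 240 completers is ours, since the abstract does not state the denominators.

```python
# Figures taken from the abstract's Methods and Results.
REACHED = 454    # participants the survey reached
COMPLETED = 240  # participants who completed the survey


def completion_rate(completed: int, reached: int) -> float:
    """Return the share of reached participants who completed the survey."""
    if reached <= 0:
        raise ValueError("reached must be positive")
    return completed / reached


def approx_count(percent: float, denominator: int) -> int:
    """Convert a reported percentage into an approximate respondent count.

    Assumption (not stated in the abstract): the percentage was computed
    over `denominator` respondents.
    """
    return round(percent / 100 * denominator)


# Percentages as reported in the Results; the labels are shorthand.
reported = {
    "supports AI assisting radiologists": 96.0,
    "open to AI for diagnostics and treatments": 92.0,
    "concerned about reliability": 61.0,
    "concerned about reduced personal interaction": 58.0,
    "trusts AI with data privacy": 72.0,
    "views AI positively overall": 90.4,
}

rate = completion_rate(COMPLETED, REACHED)
print(f"Completion rate: {rate:.1%}")
for label, pct in reported.items():
    print(f"{label}: {pct}% is about {approx_count(pct, COMPLETED)} of {COMPLETED}")
```

Under these assumptions, roughly half of the people reached completed the survey, which is worth keeping in mind when reading the headline percentages.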
Citation: Public Perspectives on Artificial Intelligence in Medicine and Radiology: Insights From a Survey in an Italian Cancer Referral Center. JCO Clinical Cancer Informatics, volume 10, e2500210 (2026-02-01). https://doi.org/10.1200/CCI-25-00210