Exploring machine learning tools in a retrospective case-study of patients with metastatic non-small cell lung cancer treated with first-line immunotherapy: A feasibility single-centre experience
Francesca Rita Ogliari , Alberto Traverso , Simone Barbieri , Marco Montagna , Filippo Chiabrando , Enrico Versino , Antonio Bosco , Alessia Lin , Roberto Ferrara , Sara Oresti , Giuseppe Damiano , Maria Grazia Viganò , Michele Ferrara , Silvia Teresa Riva , Antonio Nuccio , Francesco Maria Venanzi , Davide Vignale , Giuseppe Cicala , Anna Palmisano , Stefano Cascinu , Michele Reni
Lung Cancer, Volume 199, Article 108075. Published 2025-01-01. DOI: 10.1016/j.lungcan.2024.108075
https://www.sciencedirect.com/science/article/pii/S0169500224006093
Abstract
Background
Artificial intelligence (AI) models are emerging as promising tools for identifying predictive features in health-record data. Their application in routine clinical practice remains challenging because of technical limitations and explainability issues specific to this setting. Patients with metastatic non-small-cell lung cancer (NSCLC) receiving standard first-line immunotherapy with immune checkpoint inhibitors (ICI) are an interesting population for machine learning (ML), since up to 30 % of them do not benefit.
Methods
We retrospectively collected data on all consecutive patients with metastatic NSCLC and PD-L1 ≥ 50 % treated with first-line ICI at our institution between 2017 and 2021. Demographic, laboratory, molecular and clinical data were retrieved manually or automatically, depending on the data source. The primary aim was to explore the feasibility of ML models in a routine clinical setting and to identify problems and solutions for everyday implementation. Early progression was used as a preliminary endpoint to test our algorithm.
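The endpoint can be made concrete with a small labelling function. The Results define early progression as disease progression or death within 3 months of starting ICI; the sketch below is only an illustration of that rule, not the authors' code. The function name, the argument names and the approximation of "3 months" as 90 days are all assumptions.

```python
from datetime import date
from typing import Optional

# Assumption: "3 months" is approximated as 90 days; the abstract does not
# state the exact window in days.
EARLY_WINDOW_DAYS = 90


def is_early_progression(ici_start: date,
                         progression: Optional[date],
                         death: Optional[date],
                         window_days: int = EARLY_WINDOW_DAYS) -> bool:
    """Return True if progression or death occurred within the window.

    Hypothetical helper: a patient with no recorded event is labelled False,
    which ignores censoring (e.g. loss to follow-up before the window ends).
    """
    events = [d for d in (progression, death) if d is not None]
    if not events:
        return False
    first_event = min(events)
    return (first_event - ici_start).days <= window_days
```

A patient who started ICI on 1 January 2020 and progressed on 15 February 2020 would be labelled as an early progressor under these assumptions; one whose first event came in June would not.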
Results
Of 123 patients, 106 were included; 52/106 (49 %) had disease progression or died within 3 months of starting ICI. Early progression correlated with a higher neutrophil percentage (> 80 % of white blood cells), a higher neutrophil/lymphocyte ratio (≥ 8) and lower-range PD-L1 status (< 70 %) at baseline, consistent with the literature. Automated ML (AutoML) models run on our dataset reached precision scores of around 80 %, with Voting Ensemble emerging as the best-performing model, while white-box models (such as SHapley Additive exPlanations) provided better explainability. In all AutoML models, laboratory features were the top selected features, whereas clinical features needed more pre-processing before gaining relevance, consistent with the differences in data extraction (automatic versus manual) and in missing-data rates.
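As a rough illustration of the quantities reported above, the following sketch flags the three baseline features using the thresholds from the Results, computes precision (true positives divided by all predicted positives), and combines several models by simple majority vote. All function names are hypothetical, and the hard majority vote is an assumption: the AutoML Voting Ensemble in the study may weight or average its member models differently.

```python
def baseline_flags(neutrophil_pct, nlr, pd_l1_pct):
    """Flag the baseline features that correlated with early progression.

    Thresholds are taken from the abstract; the function itself is a
    hypothetical helper, not part of the study's pipeline.
    """
    return {
        "high_neutrophil_pct": neutrophil_pct > 80,  # > 80 % of white blood cells
        "high_nlr": nlr >= 8,                        # neutrophil/lymphocyte ratio
        "lower_range_pd_l1": pd_l1_pct < 70,         # PD-L1 between 50 % and 70 %
    }


def precision(y_true, y_pred):
    """Precision = TP / (TP + FP); returns 0.0 if nothing was predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    return tp / (tp + fp) if (tp + fp) else 0.0


def majority_vote(model_predictions):
    """Hard-voting ensemble: one list of 0/1 predictions per model.

    Assumption: simple unweighted majority; ties resolve to 0.
    """
    n_models = len(model_predictions)
    return [int(sum(votes) * 2 > n_models) for votes in zip(*model_predictions)]
```

For example, `precision([1, 0, 1, 1], [1, 1, 1, 0])` counts two true positives and one false positive, giving 2/3. A precision of around 80 % therefore means that roughly four in five patients predicted to progress early actually did.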
Conclusions
Applying ML models in clinical practice is feasible, and such models can reliably predict early progression during first-line ICI for metastatic NSCLC. Solving pre-analytical issues is key to future improvement, with a focus on automatic tools for data extraction, collection and explainability.
About the journal:
Lung Cancer is an international publication covering the clinical, translational and basic science of malignancies of the lung and chest region. Original research articles, early reports, review articles, editorials and correspondence covering the prevention, epidemiology and etiology, basic biology, pathology, clinical assessment, surgery, chemotherapy, radiotherapy, combined treatment modalities, other treatment modalities and outcomes of lung cancer are welcome.