Remy J H Martens, William P T M van Doorn, Mathie P G Leers, Steven J R Meex, Floris Helmich
{"title":"揭示不确定性:生物和分析变异对分类预测模型预测不确定性的影响。","authors":"Remy J H Martens, William P T M van Doorn, Mathie P G Leers, Steven J R Meex, Floris Helmich","doi":"10.1093/jalm/jfae115","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Interest in prediction models, including machine learning (ML) models, based on laboratory data has increased tremendously. Uncertainty in laboratory measurements and predictions based on such data are inherently intertwined. This study developed a framework for assessing the impact of biological and analytical variation on the prediction uncertainty of categorical prediction models.</p><p><strong>Methods: </strong>Practical application was demonstrated for the prediction of renal function loss (Chronic Kidney Disease Epidemiology Collaboration [CKD-EPI] equation) and 31-day mortality (advanced ML model) in 6360 emergency department patients. Model outcome was calculated in 100 000 simulations of variation in laboratory parameters. Subsequently, the percentage of discordant predictions was calculated with the original prediction as reference. Simulations were repeated assuming increasing levels of analytical variation.</p><p><strong>Results: </strong>For the ML model, area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity were 0.90, 0.44, and 0.96, respectively. At base analytical variation, the median [2.5th-97.5th percentiles] percentage of discordant predictions was 0% [0%-28.8%]. In addition, 7.2% of patients had >5% discordant predictions. At 6× base analytical variation, the median [2.5th-97.5th percentiles] percentage of discordant predictions was 0% [0%-38.8%]. In addition, 11.7% of patients had >5% discordant predictions. However, the impact of analytical variation was limited compared with biological variation. AUROC, sensitivity, and specificity were not affected by variation in laboratory parameters.</p><p><strong>Conclusions: </strong>The impact of biological and analytical variation on the prediction uncertainty of categorical prediction models, including ML models, can be estimated by the occurrence of discordant predictions in a simulation model. Nevertheless, discordant predictions at the individual level do not necessarily affect model performance at the population level.</p>","PeriodicalId":46361,"journal":{"name":"Journal of Applied Laboratory Medicine","volume":" ","pages":""},"PeriodicalIF":1.8000,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Unraveling Uncertainty: The Impact of Biological and Analytical Variation on the Prediction Uncertainty of Categorical Prediction Models.\",\"authors\":\"Remy J H Martens, William P T M van Doorn, Mathie P G Leers, Steven J R Meex, Floris Helmich\",\"doi\":\"10.1093/jalm/jfae115\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Interest in prediction models, including machine learning (ML) models, based on laboratory data has increased tremendously. Uncertainty in laboratory measurements and predictions based on such data are inherently intertwined. This study developed a framework for assessing the impact of biological and analytical variation on the prediction uncertainty of categorical prediction models.</p><p><strong>Methods: </strong>Practical application was demonstrated for the prediction of renal function loss (Chronic Kidney Disease Epidemiology Collaboration [CKD-EPI] equation) and 31-day mortality (advanced ML model) in 6360 emergency department patients. Model outcome was calculated in 100 000 simulations of variation in laboratory parameters. Subsequently, the percentage of discordant predictions was calculated with the original prediction as reference. Simulations were repeated assuming increasing levels of analytical variation.</p><p><strong>Results: </strong>For the ML model, area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity were 0.90, 0.44, and 0.96, respectively. At base analytical variation, the median [2.5th-97.5th percentiles] percentage of discordant predictions was 0% [0%-28.8%]. In addition, 7.2% of patients had >5% discordant predictions. At 6× base analytical variation, the median [2.5th-97.5th percentiles] percentage of discordant predictions was 0% [0%-38.8%]. In addition, 11.7% of patients had >5% discordant predictions. However, the impact of analytical variation was limited compared with biological variation. AUROC, sensitivity, and specificity were not affected by variation in laboratory parameters.</p><p><strong>Conclusions: </strong>The impact of biological and analytical variation on the prediction uncertainty of categorical prediction models, including ML models, can be estimated by the occurrence of discordant predictions in a simulation model. Nevertheless, discordant predictions at the individual level do not necessarily affect model performance at the population level.</p>\",\"PeriodicalId\":46361,\"journal\":{\"name\":\"Journal of Applied Laboratory Medicine\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2024-11-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Applied Laboratory Medicine\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/jalm/jfae115\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"MEDICAL LABORATORY TECHNOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Applied Laboratory Medicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/jalm/jfae115","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MEDICAL LABORATORY TECHNOLOGY","Score":null,"Total":0}
Unraveling Uncertainty: The Impact of Biological and Analytical Variation on the Prediction Uncertainty of Categorical Prediction Models.
Background: Interest in prediction models, including machine learning (ML) models, based on laboratory data has increased tremendously. Uncertainty in laboratory measurements and predictions based on such data are inherently intertwined. This study developed a framework for assessing the impact of biological and analytical variation on the prediction uncertainty of categorical prediction models.
Methods: Practical application was demonstrated for the prediction of renal function loss (Chronic Kidney Disease Epidemiology Collaboration [CKD-EPI] equation) and 31-day mortality (advanced ML model) in 6360 emergency department patients. Model outcome was calculated in 100 000 simulations of variation in laboratory parameters. Subsequently, the percentage of discordant predictions was calculated with the original prediction as reference. Simulations were repeated assuming increasing levels of analytical variation.
Results: For the ML model, area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity were 0.90, 0.44, and 0.96, respectively. At base analytical variation, the median [2.5th-97.5th percentiles] percentage of discordant predictions was 0% [0%-28.8%]. In addition, 7.2% of patients had >5% discordant predictions. At 6× base analytical variation, the median [2.5th-97.5th percentiles] percentage of discordant predictions was 0% [0%-38.8%]. In addition, 11.7% of patients had >5% discordant predictions. However, the impact of analytical variation was limited compared with biological variation. AUROC, sensitivity, and specificity were not affected by variation in laboratory parameters.
Conclusions: The impact of biological and analytical variation on the prediction uncertainty of categorical prediction models, including ML models, can be estimated by the occurrence of discordant predictions in a simulation model. Nevertheless, discordant predictions at the individual level do not necessarily affect model performance at the population level.