{"title":"Machine learning for comprehensive interaction modelling improves disease risk prediction in the UK Biobank","authors":"Heli Julkunen, Juho Rousu","doi":"10.1101/2024.08.07.24311604","DOIUrl":null,"url":null,"abstract":"Understanding how risk factors interact to jointly influence disease risk can provide insights into disease development and improve risk prediction. We introduce survivalFM, a machine learning extension to the widely used Cox proportional hazards model that incorporates estimation of all potential pairwise interaction effects on time-to-event outcomes. The method relies on learning a low-rank factorized approximation of the interaction effects, hence overcoming the computational and statistical limitations of fitting these terms in models involving\nmany predictor variables. The resulting model is fully interpretable, providing access to the estimates of both individual effects and the approximated interactions. Comprehensive evaluation of survivalFM using the UK Biobank dataset across ten disease examples and a variety\nof clinical risk factors and omics data modalities shows improved discrimination and reclassification performance (65% and 97.5% of the scenarios tested, respectively). Considering a clinical scenario of cardiovascular risk prediction using predictors from the established\nQRISK3 model, we further show that the comprehensive interaction modelling adds predictive value beyond the individual and age interaction effects currently included. These results demonstrate that comprehensive modelling of interactions can facilitate advanced insights into disease development and improve risk predictions.","PeriodicalId":501454,"journal":{"name":"medRxiv - Health Informatics","volume":"29 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"medRxiv - Health Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1101/2024.08.07.24311604","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Understanding how risk factors interact to jointly influence disease risk can provide insights into disease development and improve risk prediction. We introduce survivalFM, a machine learning extension to the widely used Cox proportional hazards model that incorporates estimation of all potential pairwise interaction effects on time-to-event outcomes. The method relies on learning a low-rank factorized approximation of the interaction effects, hence overcoming the computational and statistical limitations of fitting these terms in models involving
many predictor variables. The resulting model is fully interpretable, providing access to the estimates of both individual effects and the approximated interactions. Comprehensive evaluation of survivalFM using the UK Biobank dataset across ten disease examples and a variety
of clinical risk factors and omics data modalities shows improved discrimination and reclassification performance (65% and 97.5% of the scenarios tested, respectively). Considering a clinical scenario of cardiovascular risk prediction using predictors from the established
QRISK3 model, we further show that the comprehensive interaction modelling adds predictive value beyond the individual and age interaction effects currently included. These results demonstrate that comprehensive modelling of interactions can facilitate advanced insights into disease development and improve risk predictions.