Sara Rapuc, Blaž Stres, Ivan Verdenik, Miha Lučovnik, Damjan Osredkar
Uncovering early predictors of cerebral palsy through the application of machine learning: a case-control study
DOI: 10.1136/bmjpo-2024-002800 (https://doi.org/10.1136/bmjpo-2024-002800)
Published: 2024-08-30
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11367350/pdf/
Abstract
Objective: Cerebral palsy (CP) is a group of neurological disorders with profound implications for children's development. The identification of perinatal risk factors for CP may lead to improved preventive and therapeutic strategies. This study aimed to identify the early predictors of CP using machine learning (ML).
Design: This is a retrospective case-control study using data from two population-based databases, the Slovenian National Perinatal Information System and the Slovenian Registry of Cerebral Palsy. Multiple ML algorithms were evaluated to identify the best model for predicting CP.
Setting: This is a population-based study of CP and control subjects born in one of Slovenia's 14 maternity wards.
Participants: A total of 382 CP cases, born between 2002 and 2017, were identified. Controls were selected at a control-to-case ratio of 3:1, with matched gestational age and birth multiplicity. CP cases with congenital anomalies (n=44) were excluded from the analysis. A total of 338 CP cases and 1014 controls were included in the study.
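The cohort sizes above follow from simple arithmetic, sketched below in Python. The variable names are illustrative, and the sketch assumes the 3:1 matching was applied to the CP cases that remained after exclusion, which is consistent with the reported totals.

```python
# Cohort counts quoted in the Participants paragraph.
cp_cases_identified = 382
cp_cases_with_congenital_anomalies = 44
controls_per_case = 3  # control-to-case ratio of 3:1

# Cases remaining after excluding those with congenital anomalies.
cp_cases_included = cp_cases_identified - cp_cases_with_congenital_anomalies

# Assumption: three controls were matched to each included case.
controls_included = cp_cases_included * controls_per_case

print(cp_cases_included)                      # 338
print(controls_included)                      # 1014
print(cp_cases_included + controls_included)  # 1352 subjects in total
```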
Exposure: 135 variables relating to perinatal and maternal factors.
Main outcome measures: Area under the receiver operating characteristic (ROC) curve, sensitivity and specificity.
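For readers less familiar with these measures, the following is a minimal Python sketch of how they are commonly computed for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores are all invented for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity); label 1 = CP case, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    case receives a higher score than a random control (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(round(sens, 2), round(spec, 2), round(auc, 2))  # 0.67 0.8 0.8
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the ROC curve summarises ranking quality across all thresholds.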
Results: The stochastic gradient boosting ML model (271 cases and 812 controls) demonstrated the highest mean area under the ROC curve, 0.81 (mean sensitivity=0.46 and mean specificity=0.95). Applying this model to the validation dataset (67 cases and 202 controls) resulted in an area under the ROC curve of 0.77 (mean sensitivity=0.27 and mean specificity=0.94).
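To make these figures more concrete, the short Python sketch below checks how the cohort was divided and converts the reported validation sensitivity and specificity into approximate subject counts. The counts are a back-of-envelope estimate, not results reported by the study: the published values are means, so rounding them to whole subjects is an assumption.

```python
# Dataset sizes quoted in the Results paragraph.
train_cases, train_controls = 271, 812
valid_cases, valid_controls = 67, 202

total = train_cases + train_controls + valid_cases + valid_controls
train_fraction = (train_cases + train_controls) / total
print(total, round(train_fraction, 2))  # 1352 0.8

# Validation performance as reported (mean values).
valid_sensitivity, valid_specificity = 0.27, 0.94

# Approximate counts implied by those values (assumption: simple rounding).
cases_detected = round(valid_sensitivity * valid_cases)          # about 18
cases_missed = valid_cases - cases_detected                      # about 49
controls_cleared = round(valid_specificity * valid_controls)     # about 190
controls_flagged = valid_controls - controls_cleared             # about 12

print(cases_detected, cases_missed, controls_cleared, controls_flagged)
```

Under these assumptions, roughly 49 of the 67 validation cases would go undetected, which illustrates the limited sensitivity that accompanies the high specificity.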
Conclusions: Our final ML model using early perinatal factors could not reliably predict CP in our cohort. Future studies should evaluate models with additional factors, such as genetic and neuroimaging data.