Jonathan Hicks, Karyn Robinson, Stephanie Lee, Adam Marsh, Robert Akins
Novel Machine Learning of DNA Methylation Patterns to Diagnose Complex Disease: Identification of Cerebral Palsy with Concurrent Epilepsy.
Research Square, published 2024-09-18. DOI: https://doi.org/10.21203/rs.3.rs-4560364/v1
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11213172/pdf/
Abstract
Spastic cerebral palsy (CP) is a common pediatric-onset disability with an estimated prevalence of 0.2%. It is a complex condition characterized by muscle stiffness, contractures, and abnormal movement. Spastic CP is difficult to diagnose. Although nearly all affected children are born with it or acquire it immediately after birth, many are not identified until after 19 months of age, and the diagnosis is often not confirmed until 5 years of age. In addition, CP frequently co-occurs with other complex conditions that can complicate diagnosis and treatment. For example, an estimated 42% of spastic CP cases have co-occurring epilepsy. Recent studies indicate that altered DNA methylation patterns in peripheral blood cells are associated with CP and may have diagnostic value. Accordingly, the purpose of this study is to assess the diagnostic value of methylation in CP with more complex disease states. We evaluated machine learning classification for detecting CP based on DNA methylation pattern analysis in the context of co-occurring epilepsy. Blood samples from 30 study participants diagnosed with epilepsy (n=4), spastic CP (n=10), both (n=8), or neither (n=8) were analyzed with Illumina MethylationEPIC arrays. A novel machine learning algorithm using a Support Vector Machine (SVM) or Linear Discriminant Analysis (LDA) was developed to identify methylation loci that distinguished CP from controls and to measure the classification ability of the identified loci. Informative methylation loci were isolated in a binary comparison between CP and controls, as well as in a 4-way comparison that included epilepsy. Median F1 scores for the SVM-based analysis were 0.67 in the 4-class comparison and 1.0 in the binary classification. SVM outperformed LDA, which had median F1 scores of 0.57 and 0.86, respectively. Overall, the novel machine learning-based algorithm was able to classify study participants with spastic CP and/or epilepsy from controls with significant performance.
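To make the reported metric concrete, the following is a minimal sketch of how a median F1 score over repeated evaluation runs could be computed. It is not the authors' pipeline: the abstract does not state the averaging scheme or the validation procedure, so macro-averaging across classes and the idea of multiple runs are assumptions, the function names are hypothetical, and the labels are toy values rather than study data.

```python
from statistics import median


def f1_for_label(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


def median_f1(runs):
    """Median macro-F1 over (y_true, y_pred) pairs, one pair per run."""
    return median(macro_f1(t, p) for t, p in runs)


# Toy binary example (hypothetical labels, not study data).
truth = ["CP", "CP", "control", "control"]
perfect = ["CP", "CP", "control", "control"]
one_miss = ["CP", "control", "control", "control"]

runs = [(truth, perfect), (truth, one_miss), (truth, perfect)]
print(macro_f1(truth, one_miss))
print(median_f1(runs))
```

Because the median ignores the magnitude of outlying runs, a median F1 of 1.0 means at least half of the runs were classified perfectly, not that every run was.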