William A Russel, Jim Perry, Claire Bonzani, Amanda Dontino, Zeleke Mekonnen, Ahmet Ay, Bineyam Taye
{"title":"Feature selection and association rule learning identify risk factors of malnutrition among Ethiopian schoolchildren.","authors":"William A Russel, Jim Perry, Claire Bonzani, Amanda Dontino, Zeleke Mekonnen, Ahmet Ay, Bineyam Taye","doi":"10.3389/fepid.2023.1150619","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>Previous studies have sought to identify risk factors for malnutrition in populations of schoolchildren, depending on traditional logistic regression methods. However, holistic machine learning (ML) approaches are emerging that may provide a more comprehensive analysis of risk factors.</p><p><strong>Methods: </strong>This study employed feature selection and association rule learning ML methods in conjunction with logistic regression on epidemiological survey data from 1,036 Ethiopian school children. Our first analysis used the entire dataset and then we reran this analysis on age, residence, and sex population subsets.</p><p><strong>Results: </strong>Both logistic regression and ML methods identified older childhood age as a significant risk factor, while females and vaccinated individuals showed reduced odds of stunting. Our machine learning analyses provided additional insights into the data, as feature selection identified that age, school latrine cleanliness, large family size, and nail trimming habits were significant risk factors for stunting, underweight, and thinness. Association rule learning revealed an association between co-occurring hygiene and socio-economical variables with malnutrition that was otherwise missed using traditional statistical methods.</p><p><strong>Discussion: </strong>Our analysis supports the benefit of integrating feature selection methods, association rules learning techniques, and logistic regression to identify comprehensive risk factors associated with malnutrition in young children.</p>","PeriodicalId":73083,"journal":{"name":"Frontiers in epidemiology","volume":" ","pages":"1150619"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10910994/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in epidemiology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3389/fepid.2023.1150619","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/1/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Introduction: Previous studies have sought to identify risk factors for malnutrition in populations of schoolchildren, depending on traditional logistic regression methods. However, holistic machine learning (ML) approaches are emerging that may provide a more comprehensive analysis of risk factors.
Methods: This study employed feature selection and association rule learning ML methods in conjunction with logistic regression on epidemiological survey data from 1,036 Ethiopian school children. Our first analysis used the entire dataset and then we reran this analysis on age, residence, and sex population subsets.
Results: Both logistic regression and ML methods identified older childhood age as a significant risk factor, while females and vaccinated individuals showed reduced odds of stunting. Our machine learning analyses provided additional insights into the data, as feature selection identified that age, school latrine cleanliness, large family size, and nail trimming habits were significant risk factors for stunting, underweight, and thinness. Association rule learning revealed an association between co-occurring hygiene and socio-economical variables with malnutrition that was otherwise missed using traditional statistical methods.
Discussion: Our analysis supports the benefit of integrating feature selection methods, association rules learning techniques, and logistic regression to identify comprehensive risk factors associated with malnutrition in young children.