Solomon Ntukidem, A. Chukwu, O. Oyamakin, C. James, Ignace Habimana-Kabano
Asian Journal of Probability and Statistics. Published 2023-08-29. DOI: 10.9734/ajpas/2023/v24i2520 (https://doi.org/10.9734/ajpas/2023/v24i2520)
Trend Analysis and Determinants of under-5 Mortality in Nigeria: A Machine Learning Approach
The study examined the trend in Nigeria's under-five mortality rate from 2003 to 2018 and the determinants of under-five mortality, using data from the Nigeria Demographic and Health Survey (NDHS) rounds conducted in 2003, 2008, 2013, and 2018. All four surveys were used to study the mortality trend over the study period, while machine learning was applied only to the 2018 dataset, the most recent for Nigeria. The data were partitioned at random into a training set (70%) and a testing set (30%). Before logistic regression and neural networks were applied, the most important under-five mortality variables were selected using a random forest classifier.
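The 70/30 random partition described above can be sketched in a few lines. This is a minimal illustration only: the abstract does not say which software or random seed the authors used, so the function name `train_test_split_indices`, the `seed` parameter, and the row count of 1,000 are all assumptions made for the example.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.30, seed=0):
    """Randomly partition the row indices 0..n_rows-1 into train and test sets.

    Hypothetical helper: the name, the default seed, and the rounding rule
    are illustrative choices, not taken from the study.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])
    return train_idx, test_idx


if __name__ == "__main__":
    # 1000 is an arbitrary row count, not the size of the NDHS 2018 dataset.
    train_idx, test_idx = train_test_split_indices(1000)
    print(len(train_idx), len(test_idx))
```

Splitting on indices rather than on the rows themselves keeps the sketch independent of how the survey records are stored; the same index lists can then be used to select both the predictors and the outcome.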
The under-five mortality rates were 200.72, 156.86, 128.05, and 132.02 per 1,000 live births in 2003, 2008, 2013, and 2018, respectively. That is, about one in every five children died before their fifth birthday in 2003, one in six in 2008, one in eight in 2013, and one in seven in 2018. The forecast indicated that the under-five mortality rate would likely be 102.17 in 2023. The random forest variable importance results showed that breastfeeding initiation (when the child was first put to the breast after birth) contributed most to under-five mortality. The logistic regression results showed that delaying breastfeeding to 6-23 hours after birth, compared with 0-5 hours, increased the likelihood of child death 1.4-fold. On the test set, accuracy was 60% for logistic regression (LR) and 74% for the deep neural network (DNN); recall (sensitivity) was 63% for LR and 75% for DNN; precision was 97% for LR and 95% for DNN; the F1 score was 76% for LR and 84% for DNN; and the area under the curve (AUC) was 79% for LR and 77% for DNN.
Both the logistic regression and deep neural network models performed well in discriminative ability and accuracy. The deep neural network outperformed logistic regression on accuracy, recall, and F1 score, although logistic regression had slightly higher precision and AUC.
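Two of the reported figures can be cross-checked with simple arithmetic. The sketch below recomputes the F1 score from the reported precision and recall using the standard harmonic-mean definition, and converts each mortality rate per 1,000 live births into an approximate "one in N" ratio. The function names and the `REPORTED` dictionary are illustrative; the numbers are copied from the abstract, and nothing here reproduces the authors' own code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


def one_in_n(rate_per_1000):
    """Convert a rate per 1,000 live births into the N of 'one in N'."""
    return 1000 / rate_per_1000


# Precision and recall as reported in the abstract (as fractions).
REPORTED = {
    "LR": {"precision": 0.97, "recall": 0.63},
    "DNN": {"precision": 0.95, "recall": 0.75},
}

# Under-five mortality rates per 1,000 live births, by survey year.
RATES = {2003: 200.72, 2008: 156.86, 2013: 128.05, 2018: 132.02}

if __name__ == "__main__":
    for model, m in REPORTED.items():
        f1 = f1_score(m["precision"], m["recall"])
        print(f"{model}: F1 = {f1:.2%}")
    for year, rate in RATES.items():
        print(f"{year}: about one in {one_in_n(rate):.2f}")
```

The recomputed F1 values round to 76% for LR and 84% for DNN, matching the reported scores. The "one in N" values are left unrounded so that the reader can compare them directly with the rounded figures quoted in the text.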