Determinants in Predicting Life Expectancy Using Machine Learning

B. Kouame Amos, I. V. Smirnov
DOI: 10.23947/2687-1653-2022-22-4-373-383
Journal: International Journal of Advanced Engineering Research and Science
Publication date: 2023-01-10

Abstract

Introduction. Life expectancy is, by definition, the average number of years a person can expect to live from birth to death. It is therefore the best indicator for assessing human health, and also a comprehensive index of the level of economic development, education, and health systems. Our review of the literature shows that most existing studies offer qualitative analyses of one or a few factors. Quantitative analyses of multiple factors are lacking, so the predominant factor influencing life expectancy cannot be identified with precision. Given the variety of conditions and complications observed in society today, several factors must be considered together when predicting life expectancy, and various machine learning models have been developed for this purpose. The aim of this article is to identify the factors that determine life expectancy.

Materials and Methods. We use the Pearson correlation coefficient to assess correlations between indicators, and multiple linear regression, Ridge regression, and Lasso regression to measure the impact of each indicator on life expectancy. For model selection, the Akaike information criterion (AIC), the coefficient of determination (R2), and the mean square error were used.

Results. Based on these criteria, multiple linear regression was selected for the life expectancy prediction model, as it obtained the smallest AIC (6109.07), an adjusted R2 of 85 %, and an RMSE of 3.85.

Conclusion and Discussion. We conclude that the variables that best explain life expectancy are adult mortality, infant mortality, percentage of expenditure, measles, under-five mortality, polio, total expenditure, diphtheria, HIV/AIDS, GDP, longevity of 1.19 years, resource composition, and schooling. The results of this analysis can be used by the World Health Organization and the health sectors to improve society.
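
The workflow described in Materials and Methods (Pearson correlation, then multiple linear, Ridge, and Lasso regression compared by AIC, adjusted R2, and RMSE) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic rather than the study's indicators, the function names (pearson, fit_linear, fit_lasso, evaluate) are hypothetical, the penalty strengths are arbitrary, and the AIC counts every fitted coefficient as a parameter, which is a simplification for the penalized models. It is not the authors' implementation.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def fit_linear(X, y, ridge=0.0):
    """Least squares with an intercept; ridge > 0 adds an L2 penalty
    on the slopes only (the intercept is not penalized)."""
    A = np.column_stack([np.ones(len(X)), X])
    penalty = ridge * np.eye(A.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)

def fit_lasso(X, y, alpha, n_iter=200):
    """Lasso by cyclic coordinate descent, minimizing
    (1 / 2n) * ||y - b0 - X b||^2 + alpha * ||b||_1."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean          # centering handles the intercept
    col_sq = (Xc ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution removed.
            partial = yc - Xc @ beta + Xc[:, j] * beta[j]
            rho = Xc[:, j] @ partial / n
            # Soft-thresholding update for coordinate j.
            beta[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
    return np.concatenate([[y_mean - x_mean @ beta], beta])

def evaluate(coef, X, y):
    """In-sample RMSE, adjusted R^2, and Gaussian AIC for a fitted model."""
    n, p = X.shape
    resid = y - (coef[0] + X @ coef[1:])
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - rss / tss
    k = p + 2  # slopes + intercept + noise variance (simplified count)
    return {
        "rmse": float(np.sqrt(rss / n)),
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1),
        "aic": float(n * (np.log(2 * np.pi * rss / n) + 1) + 2 * k),
    }

# Synthetic stand-in data: three predictors, the third has no true effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 70.0 + X @ np.array([-2.0, 1.5, 0.0]) + rng.normal(scale=0.5, size=200)

print("Pearson r per predictor:",
      [round(pearson(X[:, j], y), 3) for j in range(X.shape[1])])

models = {
    "ols": fit_linear(X, y),
    "ridge": fit_linear(X, y, ridge=10.0),
    "lasso": fit_lasso(X, y, alpha=0.1),
}
results = {name: evaluate(coef, X, y) for name, coef in models.items()}
for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})

# Select the model with the smallest AIC, as in the selection rule above.
best = min(results, key=lambda name: results[name]["aic"])
print("selected:", best)
```

Because all three models are scored on the data they were fitted to and share the same parameter count, ordinary least squares has the smallest residual sum of squares by construction and therefore also the smallest AIC in this sketch. Scoring on a held-out set, or using an effective degrees-of-freedom count for the penalized models, would give a fairer comparison.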