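Prediction of indicators through machine learning and anomaly detection: a case study in the supplementary health system in Brazil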

Mirele Marques Borges, Cláudio José Müller
Journal: Independent Journal of Management & Production (JCR Q4, Management; impact factor 0.4)
Publication type: Journal Article
Publication date: 2021-12-01
DOI: 10.14807/ijmp.v12i8.1481 (https://doi.org/10.14807/ijmp.v12i8.1481)
Citation count: 0

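Abstract

This research investigated the stages of building a machine learning model to predict an indicator, the number of medical appointments per day in the supplementary health sector in the Porto Alegre region (Rio Grande do Sul, Brazil), and proposed a metric for anomaly detection. The methodology combined a literature review with an applied case study, and the statistical software R was used to prepare the data and build the model. The case study comprised the following stages: database extraction; division of the data into training and test sets; creation of functions and feature engineering; variable selection and correlation analysis; choice of algorithms with cross-validation and tuning; model training; application of the models to the test data; selection of the best model; and proposal of the anomaly detection metric. At the end of these stages, Random Forest was selected as the best model in terms of MAE (Mean Absolute Error), outperforming Linear Regression and Neural Network. The standard deviation metric also identified nine anomaly points and thirty-eight warning points. The study concluded that the feature engineering and variable selection steps were essential for creating and selecting the model, and that the proposed metric achieved its objective of generating alerts on the indicator, flagging cases with possible problems or opportunities.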
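The two quantitative steps named in the abstract, choosing the model with the lowest MAE and raising alerts from a standard deviation metric, can be sketched as follows. This is a minimal illustration and not the authors' implementation. The study used R, while this sketch uses Python's standard library. The abstract does not state the thresholds or what the standard deviation is computed over, so the sketch assumes the metric is applied to prediction residuals, with a warning beyond 2 standard deviations and an anomaly beyond 3. The names `select_best_model`, `flag_points`, `warn_k`, and `anomaly_k` are hypothetical.

```python
from statistics import mean, pstdev


def mean_absolute_error(actual, predicted):
    """MAE: the average of |actual - predicted| over all points."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


def select_best_model(actual, predictions_by_model):
    """Return (name, mae) for the model with the lowest MAE on the test data."""
    scores = {
        name: mean_absolute_error(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    best = min(scores, key=scores.get)
    return best, scores[best]


def flag_points(actual, predicted, warn_k=2.0, anomaly_k=3.0):
    """Label each point 'normal', 'warning', or 'anomaly'.

    Assumption (not from the paper): a point is judged by how far its
    residual lies from the mean residual, measured in standard deviations.
    """
    residuals = [a - p for a, p in zip(actual, predicted)]
    center = mean(residuals)
    spread = pstdev(residuals)  # population standard deviation of residuals
    labels = []
    for r in residuals:
        distance = abs(r - center)
        if spread > 0 and distance > anomaly_k * spread:
            labels.append("anomaly")
        elif spread > 0 and distance > warn_k * spread:
            labels.append("warning")
        else:
            labels.append("normal")
    return labels
```

In use, `select_best_model` would receive the observed daily appointment counts for the test period plus one prediction list per candidate algorithm, and `flag_points` would then be run on the winning model's predictions. Computing the spread from the same residuals that are being flagged lets large outliers inflate the threshold, so a reference period may be a better baseline in practice.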