An Accident Prediction Model Based on ARIMA in Kuala Lumpur, Malaysia, Using Time Series of Actual Accidents and Related Data

Pertanika Journal of Science and Technology (IF 0.6, Q3, Multidisciplinary Sciences). Pub Date: 2024-04-01. DOI: 10.47836/pjst.32.3.07
Boon Chong Choo, Musab Abdul Razak, Mohd Zahirasri Mohd Tohir, D. A. Awang Biak, Syafiie Syam

Abstract

Time series analysis, supported by sophisticated tools for optimally fitting time series models, is an emerging trend. Malaysian industrial accident data, however, remain underutilised and lack informative records. This paper therefore investigates the Malaysian accident database and evaluates the optimal forecasting models for accident prediction. The models' input was based on available data from the Department of Occupational Safety and Health, Malaysia (DOSH), covering 2018 to 2021, with 80% of the dataset used to train the models and the remaining 20% used for validation. Predictions based on the negative binomial and Poisson distributions showed a mean absolute percentage error (MAPE) of 33% and 51%, respectively, indicating that the negative binomial distribution performed better than the Poisson distribution in predicting accident frequency. The four years of time series accident data were checked for stationarity with the Augmented Dickey-Fuller test in R Studio. ARIMA models were considered after the data showed autocorrelation. The best model, justified by the lowest Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and other error values, was ARIMA(2,0,2)(2,0,0)(12). The MAPE values for ARIMA in R and for the manual time series were 40% and 49%, respectively. The authors therefore conclude that accident prediction using R Studio would outperform the manual negative binomial and Poisson distribution predictions. Based on the findings, industrial safety practitioners should report accidents to DOSH truthfully in the era of digitalisation, which would enable future data-driven accident predictions to be carried out.
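The evaluation steps named in the abstract (an 80%/20% split, MAPE, and AIC/BIC for model comparison) can be sketched as follows. This is a minimal illustration in Python rather than the authors' R workflow. The function names, the made-up monthly counts, and the training-mean baseline forecast are all assumptions introduced here, and the split is assumed to be chronological, which the abstract does not state.

```python
import math


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into (train, validation) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Made-up monthly counts for illustration only (48 months); NOT the DOSH data.
monthly_counts = [12, 15, 9, 11, 14, 10, 13, 16, 8, 12, 11, 15] * 4

train, valid = chronological_split(monthly_counts, train_fraction=0.8)

# Hypothetical placeholder forecast: the training mean repeated for each
# validation month. A fitted ARIMA or count model would replace this.
baseline_forecast = [sum(train) / len(train)] * len(valid)

print("train months:", len(train), "validation months:", len(valid))
print("baseline MAPE (%):", round(mape(valid, baseline_forecast), 1))
```

Accident counts can be zero in some months, in which case MAPE is undefined. The sketch raises an error rather than silently skipping such months, and how the paper handled this case is not stated in the abstract.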
Source journal
Pertanika Journal of Science and Technology
Category: MULTIDISCIPLINARY SCIENCES
CiteScore: 1.50
Self-citation rate: 16.70%
Number of publications: 178
Journal description: Pertanika Journal of Science and Technology aims to provide a forum for high-quality research in science and engineering. Areas relevant to the scope of the journal include: bioinformatics, bioscience, biotechnology and bio-molecular sciences, chemistry, computer science, ecology, engineering, engineering design, environmental control and management, mathematics and statistics, medicine and health sciences, nanotechnology, physics, safety and emergency management, and related fields of study.