Machine Learning Method for Changepoint Detection in Short Time Series Data

IF 4 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Machine learning and knowledge extraction Pub Date : 2023-10-05 DOI:10.3390/make5040071
Veronika Smejkalová, Radovan Šomplák, Martin Rosecký, Kristína Šramková
{"title":"Machine Learning Method for Changepoint Detection in Short Time Series Data","authors":"Veronika Smejkalová, Radovan Šomplák, Martin Rosecký, Kristína Šramková","doi":"10.3390/make5040071","DOIUrl":null,"url":null,"abstract":"Analysis of data is crucial in waste management to improve effective planning from both short- and long-term perspectives. Real-world data often presents anomalies, but in the waste management sector, anomaly detection is seldom performed. The main goal and contribution of this paper is a proposal of a complex machine learning framework for changepoint detection in a large number of short time series from waste management. In such a case, it is not possible to use only an expert-based approach due to the time-consuming nature of this process and subjectivity. The proposed framework consists of two steps: (1) outlier detection via outlier test for trend-adjusted data, and (2) changepoints are identified via comparison of linear model parameters. In order to use the proposed method, it is necessary to have a sufficient number of experts’ assessments of the presence of anomalies in time series. The proposed framework is demonstrated on waste management data from the Czech Republic. It is observed that certain waste categories in specific regions frequently exhibit changepoints. On the micro-regional level, approximately 31.1% of time series contain at least one outlier and 16.4% exhibit changepoints. Certain groups of waste are more prone to the occurrence of anomalies. The results indicate that even in the case of aggregated data, anomalies are not rare, and their presence should always be checked.","PeriodicalId":93033,"journal":{"name":"Machine learning and knowledge extraction","volume":"56 1","pages":"0"},"PeriodicalIF":4.0000,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine learning and knowledge extraction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/make5040071","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Analysis of data is crucial in waste management to improve effective planning from both short- and long-term perspectives. Real-world data often presents anomalies, but in the waste management sector, anomaly detection is seldom performed. The main goal and contribution of this paper is a proposal of a complex machine learning framework for changepoint detection in a large number of short time series from waste management. In such a case, it is not possible to use only an expert-based approach due to the time-consuming nature of this process and subjectivity. The proposed framework consists of two steps: (1) outlier detection via outlier test for trend-adjusted data, and (2) changepoints are identified via comparison of linear model parameters. In order to use the proposed method, it is necessary to have a sufficient number of experts’ assessments of the presence of anomalies in time series. The proposed framework is demonstrated on waste management data from the Czech Republic. It is observed that certain waste categories in specific regions frequently exhibit changepoints. On the micro-regional level, approximately 31.1% of time series contain at least one outlier and 16.4% exhibit changepoints. Certain groups of waste are more prone to the occurrence of anomalies. The results indicate that even in the case of aggregated data, anomalies are not rare, and their presence should always be checked.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
短时间序列数据中变化点检测的机器学习方法
数据分析对于废物管理至关重要,可以从短期和长期的角度改进有效的规划。现实世界的数据经常出现异常,但在废物管理领域,很少进行异常检测。本文的主要目标和贡献是提出了一个复杂的机器学习框架,用于从废物管理的大量短时间序列中检测变更点。在这种情况下,由于这个过程的耗时和主观性,不可能只使用基于专家的方法。该框架包括两个步骤:(1)通过对趋势调整数据的异常值检验来检测异常值;(2)通过比较线性模型参数来识别变化点。为了使用所提出的方法,需要有足够数量的专家对时间序列中异常的存在进行评估。拟议的框架以捷克共和国的废物管理数据为依据。可以观察到,特定区域的某些废物类别经常表现出变化点。在微观区域水平上,约31.1%的时间序列至少包含一个异常值,16.4%的时间序列表现出变化点。某些废物组更容易发生异常情况。结果表明,即使在汇总数据的情况下,异常也不罕见,并且应该始终检查它们的存在。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
6.30
自引率
0.00%
发文量
0
审稿时长
7 weeks
期刊最新文献
Knowledge Graph Extraction of Business Interactions from News Text for Business Networking Analysis Machine Learning for an Enhanced Credit Risk Analysis: A Comparative Study of Loan Approval Prediction Models Integrating Mental Health Data A Data Mining Approach for Health Transport Demand Predicting Wind Comfort in an Urban Area: A Comparison of a Regression- with a Classification-CNN for General Wind Rose Statistics An Evaluative Baseline for Sentence-Level Semantic Division
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1