Performance measurement with high performance computer of HW-GA anomaly detection algorithms for streaming data

Jakup Fondaj, Zirije Hasani, Samedin Krrabaj
DOI: 10.7494/csci.2022.23.3.4389 (https://doi.org/10.7494/csci.2022.23.3.4389)
Published: 2022-10-02. Publication type: Journal Article. Journal as listed in the page metadata: Theor. Comput. Sci.
Citations: 0

Abstract

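Anomaly detection matters in every sector, including health, education, and business. Knowing what is going wrong with data or with a digital system helps people in each sector make decisions. Detecting anomalies in real-time Big Data is now crucial. Because dealing with real-time data requires speed, the aim of this paper is to measure the performance of our previously proposed HW-GA algorithm and compare it with other anomaly detection algorithms. We analyze several factors that may affect the performance of HW-GA, such as visualization of the results, the amount of data, and the performance of the computer. Algorithm execution time and CPU usage are the parameters measured to evaluate the performance of HW-GA. A further aim is to test HW-GA with a large amount of data, to verify whether it finds the possible anomalies, and to compare the results with those of other algorithms. The experiments are done in R with several datasets: real Covid-19 and e-dnevnik data, and three benchmarks from the Numenta datasets. The real data have no known anomalies, whereas the anomalies in the benchmark data are known; this makes it possible to evaluate how the algorithms behave in both situations. The novelty of this paper is that performance is tested on three different computers, one of which is a high-performance computer.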
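The two metrics named in the abstract, execution time and CPU usage, can be illustrated with a minimal sketch. It rests on several assumptions: the paper's experiments are in R, whereas this sketch is in Python; the detector `rolling_zscore_anomalies` is a simple stand-in and is not HW-GA; the data come from a hypothetical `synthetic_stream` helper with one injected spike, not from the Covid-19, e-dnevnik, or Numenta datasets. CPU usage is approximated here by process CPU time (`time.process_time`), reported alongside wall-clock time (`time.perf_counter`).

```python
import random
import statistics
import time


def rolling_zscore_anomalies(values, window=30, threshold=3.0):
    """Stand-in detector (NOT HW-GA): flag points far from the trailing-window mean."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mean = statistics.fmean(history)
        spread = statistics.pstdev(history)
        if spread > 0 and abs(values[i] - mean) > threshold * spread:
            flagged.append(i)
    return flagged


def measure(detector, values, repeats=3):
    """Return the best wall-clock and CPU seconds over `repeats` runs, plus the last result."""
    best_wall = best_cpu = float("inf")
    result = None
    for _ in range(repeats):
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        result = detector(values)
        best_cpu = min(best_cpu, time.process_time() - cpu_start)
        best_wall = min(best_wall, time.perf_counter() - wall_start)
    return {"wall_s": best_wall, "cpu_s": best_cpu, "result": result}


def synthetic_stream(n, anomaly_at, seed=0):
    """Hypothetical data: Gaussian noise with one injected spike at a known index."""
    rng = random.Random(seed)
    values = [rng.gauss(0.0, 1.0) for _ in range(n)]
    values[anomaly_at] += 25.0
    return values


if __name__ == "__main__":
    # Vary the amount of data, one of the factors the abstract lists.
    for n in (1_000, 10_000):
        stream = synthetic_stream(n, anomaly_at=n // 2)
        stats = measure(rolling_zscore_anomalies, stream)
        print(f"n={n} wall={stats['wall_s']:.4f}s cpu={stats['cpu_s']:.4f}s "
              f"flagged={len(stats['result'])}")
```

Running the same script unchanged on several machines would give a rough cross-machine comparison of the kind the abstract describes; taking the best of a few repeats reduces noise from other processes, though it does not eliminate it.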