A Parametric and Non-Parametric Approach for High-Accurate Outlier Detection

Journal of Information Science and Engineering | Impact factor: 0.5 | Zone 4 (Computer Science) | JCR Q4 (Computer Science, Information Systems) | Published: 2020-03-01 | DOI: 10.6688/JISE.202003_36(2).0018
Mohamed Jaward Bah, Honghi Wang
Pages: 441-465
Citations: 0

Abstract

Outlier detection is an essential problem that has been studied across a wide range of applications in diverse fields. One common approach is to use statistical models, but these methods have inherent challenges and drawbacks, for instance in detecting outliers effectively with a high detection rate while minimizing computational cost. The statistical techniques proposed so far fall mainly into parametric and non-parametric methods. To the best of our knowledge, evaluating these two families against each other remains an open research direction, and most earlier statistical methods have not shown high outlier detection accuracy.

In this paper, under the umbrella of the statistical approach, we propose Gaussian Mixture Model for Outlier Detection (GMMOD) as the parametric method and Kernel Density Estimation for Outlier Detection (KDEOD) as the non-parametric method, with the aim of detecting outliers more effectively and improving detection accuracy. The proposed methods are applied to real-world datasets. Our experimental results show that although both techniques perform well, KDEOD outperforms GMMOD by a small margin in most cases, and both improve on the comparable algorithms they were evaluated against.
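The abstract does not spell out how GMMOD and KDEOD score points, so the following is only a generic sketch of the two families it contrasts, not the authors' algorithms. It assumes one-dimensional data, estimates a density parametrically (a Gaussian mixture fitted with plain EM) and non-parametrically (a Gaussian kernel density estimate), and flags the points with the lowest estimated density. The function names, the quantile-based initialisation, the bandwidth of 0.5, and the synthetic data are all illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture with plain EM (parametric)."""
    srt, n = sorted(data), len(data)
    # Assumed initialisation: means at evenly spaced quantiles, shared variance.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    variances = [sum((x - centre) ** 2 for x in data) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps a component from collapsing
    return weights, means, variances


def gmm_log_density(x, params):
    """Log density of x under the fitted mixture."""
    weights, means, variances = params
    dens = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return math.log(dens + 1e-300)


def kde_log_density(x, data, bandwidth):
    """Log of a Gaussian kernel density estimate at x (non-parametric)."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
    return math.log(total / norm + 1e-300)


def lowest_density_indices(log_densities, n_flag):
    """Indices of the n_flag points with the lowest estimated density."""
    order = sorted(range(len(log_densities)), key=lambda i: log_densities[i])
    return sorted(order[:n_flag])


# Synthetic data (an assumption, not the paper's datasets): two clusters plus
# two injected outliers at indices 200 and 201.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(100)]
data += [rng.gauss(10.0, 1.0) for _ in range(100)]
data += [-15.0, 25.0]

params = fit_gmm_1d(data, k=2)
gmm_flags = lowest_density_indices([gmm_log_density(x, params) for x in data], 2)
kde_flags = lowest_density_indices([kde_log_density(x, data, 0.5) for x in data], 2)
print("GMM flags:", gmm_flags)
print("KDE flags:", kde_flags)
```

In this sketch the parametric route has to commit to a number of components up front, whereas the kernel estimate only needs a bandwidth. That is one reason the two families can behave differently on the same data.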
Source journal
Journal of Information Science and Engineering (Engineering & Technology: Computer Science, Information Systems)
CiteScore: 2.00
Self-citation rate: 0.00%
Articles published: 4
Review time: 8 months
About the journal: The Journal of Information Science and Engineering is dedicated to the dissemination of information on computer science, computer engineering, and computer systems. The journal encourages articles on original research in computer hardware, software, man-machine interface, theory, and applications; tutorial papers in those areas; and state-of-the-art papers on various aspects of computer systems and applications.