A new ensemble method for outlier identification

2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence) | Pub Date: 2020-01-01 | DOI: 10.1109/Confluence47617.2020.9058219

Stamatios-Aggelos N. Alexandropoulos, S. Kotsiantis, Violetta E. Piperigou, M. Vrahatis


Citations: 5

Abstract

Many factors influence the applicability of machine learning methods and statistical models to a given task. The presence of outliers in a data set is a common issue that must be addressed. Identifying such values is difficult, yet very useful. In many cases, errors or values that differ markedly from the majority of the data are useless; nevertheless, valuable information can be hidden in the set of outliers. Although several models for outlier detection have been developed in recent years, there is always room for new, intelligent, more efficient, and less time-consuming techniques. In the present work we propose a new ensemble method for outlier detection. To test the proposed methodology, we compare it with widely used outlier detection techniques. The results indicate that our model is robust and quite competitive with the other methods.

