Flexible and adaptive subspace search for outlier analysis

Proceedings of the 22nd ACM international conference on Information & Knowledge Management Pub Date : 2013-10-27 DOI:10.1145/2505515.2505560

F. Keller, Emmanuel Müller, Andreas Wixler, Klemens Böhm

{"title":"Flexible and adaptive subspace search for outlier analysis","authors":"F. Keller, Emmanuel Müller, Andreas Wixler, Klemens Böhm","doi":"10.1145/2505515.2505560","DOIUrl":null,"url":null,"abstract":"There exists a variety of traditional outlier models, which measure the deviation of outliers with respect to the full attribute space. However, these techniques fail to detect outliers that deviate only w.r.t. an attribute subset. To address this problem, recent techniques focus on a selection of subspaces that allow: (1) A clear distinction between clustered objects and outliers; (2) a description of outlier reasons by the selected subspaces. However, depending on the outlier model used, different objects in different subspaces have the highest deviation. It is an open research issue to make subspace selection adaptive to the outlier score of each object and flexible w.r.t. the use of different outlier models. In this work we propose such a flexible and adaptive subspace selection scheme. Our generic processing allows instantiations with different outlier models. We utilize the differences of outlier scores in random subspaces to perform a combinatorial refinement of relevant subspaces. Our refinement allows an individual selection of subspaces for each outlier, which is tailored to the underlying outlier model. In the experiments we show the flexibility of our subspace search w.r.t. various outlier models such as distance-based, angle-based, and local-density-based outlier detection.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"28 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2505515.2505560","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 21

Abstract

There exists a variety of traditional outlier models, which measure the deviation of outliers with respect to the full attribute space. However, these techniques fail to detect outliers that deviate only w.r.t. an attribute subset. To address this problem, recent techniques focus on a selection of subspaces that allow: (1) A clear distinction between clustered objects and outliers; (2) a description of outlier reasons by the selected subspaces. However, depending on the outlier model used, different objects in different subspaces have the highest deviation. It is an open research issue to make subspace selection adaptive to the outlier score of each object and flexible w.r.t. the use of different outlier models. In this work we propose such a flexible and adaptive subspace selection scheme. Our generic processing allows instantiations with different outlier models. We utilize the differences of outlier scores in random subspaces to perform a combinatorial refinement of relevant subspaces. Our refinement allows an individual selection of subspaces for each outlier, which is tailored to the underlying outlier model. In the experiments we show the flexibility of our subspace search w.r.t. various outlier models such as distance-based, angle-based, and local-density-based outlier detection.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

灵活和自适应子空间搜索异常值分析

传统的离群值模型有多种，它们衡量的是离群值相对于整个属性空间的偏差。然而，这些技术无法检测到只偏离属性子集的异常值。为了解决这个问题，最近的技术集中在子空间的选择上，这些子空间允许:(1)明确区分聚类对象和异常值;(2)选取子空间描述离群原因。然而，根据所使用的离群值模型，不同子空间中的不同对象具有最高的偏差。如何使子空间选择适应每个目标的离群值得分，并灵活地使用不同的离群值模型，是一个有待研究的问题。本文提出了一种灵活的自适应子空间选择方案。我们的通用处理允许使用不同的离群模型实例化。我们利用随机子空间中离群值的差异对相关子空间进行组合细化。我们的细化允许为每个离群值单独选择子空间，这是针对底层离群值模型量身定制的。在实验中，我们展示了子空间搜索与各种离群点模型(如基于距离、基于角度和基于局部密度的离群点检测)的灵活性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Proceedings of the 22nd ACM international conference on Information & Knowledge Management

自引率

0.00%

发文量

期刊最新文献

Exploring XML data is as easy as using maps Mining-based compression approach of propositional formulae Flexible and dynamic compromises for effective recommendations Efficient parsing-based search over structured data Recommendation via user's personality and social contextual