Mining outlier based on semantic relations
Hongfang Zhou, Hongyang Li
2011 2nd International Conference on Control, Instrumentation and Automation (ICCIA), 2011-12-01
DOI: 10.1109/ICCIAUTOM.2011.6183904
Citations: 0
Abstract
Existing outlier detection methods do not take the semantic knowledge of the dataset into consideration. They try to find outliers only from the dataset itself, which prevents them from finding more meaningful outliers. In this paper, we consider the problem of outlier detection that integrates semantic relations hidden in Web logs. We give a new definition of a semantic outlier: a data point that behaves differently from other data points in the same cluster while looking normal with respect to data points in another cluster. We present a measure of the degree to which each object is an outlier, called the Likelihood of Semantic Outlier (LSO). We also propose an efficient algorithm for mining semantic outliers based on LSO. Experiments on real data show that the proposed algorithm is efficient and effective.
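The abstract does not give the LSO formula, so the following is only a minimal sketch of the stated definition, not the paper's measure. It assumes numeric feature vectors, Euclidean distance, and centroid-based comparison; the function names `centroid` and `illustrative_score` and the toy data are hypothetical. The score is close to 1 when a point is far from the rest of its own cluster but near another cluster, and close to 0 otherwise.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))


def illustrative_score(point, own_label, clusters):
    """Hypothetical stand-in for an outlier-likelihood score (NOT the paper's LSO).

    Compares the distance from `point` to the centroid of the other members
    of its own cluster with the distance to the nearest other-cluster centroid.
    `point` must be a member of clusters[own_label].
    """
    own = list(clusters[own_label])
    own.remove(point)  # leave the point itself out of its own centroid
    if not own:
        return 0.0  # a singleton cluster gives nothing to compare against
    d_own = math.dist(point, centroid(own))
    d_other = min(
        math.dist(point, centroid(pts))
        for label, pts in clusters.items()
        if label != own_label
    )
    total = d_own + d_other
    return d_own / total if total > 0 else 0.0


# Toy data (assumed): (9, 9) is labeled "A" but sits next to cluster "B".
clusters = {
    "A": [(0, 0), (0, 1), (1, 0), (9, 9)],
    "B": [(10, 10), (10, 9), (9, 10)],
}
suspicious = illustrative_score((9, 9), "A", clusters)
ordinary = illustrative_score((0, 0), "A", clusters)
print(suspicious > ordinary)
```

In this toy example the point (9, 9) scores much higher than (0, 0), which matches the intuition in the definition: it disagrees with its own cluster while looking normal relative to another one.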