Authors: Huang Hong, Guo Juan, Wang Ben
Published in: 2012 Third Global Congress on Intelligent Systems (2012-11-06)
DOI: 10.1109/GCIS.2012.86 (https://doi.org/10.1109/GCIS.2012.86)
An Improved KNN Algorithm Based on Adaptive Cluster Distance Bounding for High Dimensional Indexing
Because cluster distance bounds are tight, the distance from the query vector to a cluster's bound is close to the true distance between the query and that cluster. Filtering out irrelevant clusters by this distance during similarity search therefore substantially reduces I/O cost, so the "curse of dimensionality" can be largely avoided. We propose an improved KNN search algorithm based on adaptive cluster distance bounding for high-dimensional indexing. It reduces CPU cost by using the triangle inequality to skip unnecessary distance calculations, at the price of some overhead and preprocessing. Finally, experiments on a real data set verify that the improved exact KNN search algorithm achieves better performance.
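The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of triangle-inequality pruning in a cluster-based exact KNN search, not the authors' method. It assumes Euclidean distance, precomputed centroids, and a simple spherical cluster bound (centroid plus radius) in place of the paper's adaptive cluster distance bounding. The names `build_index` and `knn_search` are hypothetical. For a point p in a cluster with centroid c, the triangle inequality gives d(q, p) >= |d(q, c) - d(p, c)|, so a point whose lower bound already exceeds the current k-th best distance can be skipped without computing d(q, p).

```python
import heapq
import math


def build_index(points, centroids):
    """Preprocessing (paid once): assign each point to its nearest centroid
    and store the point's distance to that centroid."""
    clusters = [[] for _ in centroids]
    for idx, p in enumerate(points):
        dists = [math.dist(p, c) for c in centroids]
        j = min(range(len(centroids)), key=dists.__getitem__)
        clusters[j].append((dists[j], idx))
    # Radius of each cluster: largest member-to-centroid distance.
    radii = [max((d for d, _ in members), default=0.0) for members in clusters]
    return clusters, radii


def knn_search(query, k, points, centroids, clusters, radii):
    """Exact k-NN search. Returns (sorted [(distance, index)], number of
    query-to-point distances actually computed)."""
    q_to_c = [math.dist(query, c) for c in centroids]
    # Lower bound on the distance from the query to any point of cluster j.
    bound = [max(0.0, q_to_c[j] - radii[j]) for j in range(len(centroids))]
    order = sorted(range(len(centroids)), key=bound.__getitem__)

    heap = []      # max-heap of the best k so far, stored as (-distance, index)
    computed = 0
    for j in order:
        kth = -heap[0][0] if len(heap) == k else math.inf
        if bound[j] >= kth:
            break  # clusters are sorted by bound, so the rest cannot help
        for d_pc, idx in clusters[j]:
            kth = -heap[0][0] if len(heap) == k else math.inf
            # Triangle inequality: d(q, p) >= |d(q, c) - d(p, c)|.
            if abs(q_to_c[j] - d_pc) >= kth:
                continue  # skipped without a full distance computation
            d = math.dist(query, points[idx])
            computed += 1
            if len(heap) < k:
                heapq.heappush(heap, (-d, idx))
            elif d < kth:
                heapq.heapreplace(heap, (-d, idx))
    return sorted((-nd, idx) for nd, idx in heap), computed
```

Calling `build_index` once and then `knn_search` per query returns the same distances as a brute-force scan, while `computed` reports how many full distance calculations remained after pruning. The count excludes the query-to-centroid distances, which together with the stored point-to-centroid distances correspond to the overhead and preprocessing cost the abstract mentions.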