A k-means clustering algorithm based on the distribution of SIFT
Huixiang Lv, Xianglin Huang, Lifang Yang, Tao Liu, Ping Wang
2013 IEEE Third International Conference on Information Science and Technology (ICIST)
Published: 2013-03-23
DOI: https://doi.org/10.1109/ICIST.2013.6747776
Citations: 8
Abstract
Bag-of-Words based image retrieval has recently become a research hotspot. To improve visual word training in Bag-of-Words based image retrieval systems, a k-means clustering algorithm based on the distribution of SIFT (Scale Invariant Feature Transform) feature data along each dimension is proposed. The initial cluster centers are obtained by analyzing the distribution of the SIFT feature data along each dimension and combining this analysis with the iDistance method, which is used in high-dimensional indexing to partition the data space adaptively according to the data distribution. AKM (approximate k-means) is then used to cluster the sample feature data, train the visual words, and produce the final visual vocabulary. In AKM, a k-d tree is built on the cluster centers at the beginning of each iteration to speed up the assignment of features to their nearest centers. An image retrieval system is built to verify the performance of the proposed method. Experiments are carried out on the Oxford Buildings 5k dataset, which contains 11 landmarks, and mAP (mean Average Precision) is used to evaluate retrieval performance. The proposed method achieves an mAP of 31.9%, compared with 29.8% for AKM, which indicates that it improves the visual word training process and, in turn, Bag-of-Words based image retrieval performance.
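The clustering loop described in the abstract, with a k-d tree rebuilt on the cluster centers at the start of every iteration, can be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: the function names (`build_kdtree`, `nearest`, `kmeans_kdtree`) are hypothetical, the tree search here is exact rather than approximate (AKM as usually described relies on approximate nearest-neighbor search), and the initial centers are simply passed in, since the abstract does not give the details of the distribution-based initialization.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, ids, depth=0):
    """Build a k-d tree over points[i] for i in ids, cycling through the axes."""
    if not ids:
        return None
    axis = depth % len(points[0])
    ids = sorted(ids, key=lambda i: points[i][axis])
    mid = len(ids) // 2
    return {
        "id": ids[mid],  # index of the point stored at this node
        "axis": axis,    # splitting dimension
        "left": build_kdtree(points, ids[:mid], depth + 1),
        "right": build_kdtree(points, ids[mid + 1:], depth + 1),
    }


def nearest(node, points, q, best=None):
    """Return (squared distance, index) of the point nearest to q (exact search)."""
    if node is None:
        return best
    d = sq_dist(points[node["id"]], q)
    if best is None or d < best[0]:
        best = (d, node["id"])
    diff = q[node["axis"]] - points[node["id"]][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, points, q, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < best[0]:
        best = nearest(far, points, q, best)
    return best


def kmeans_kdtree(data, init_centers, iters=10):
    """k-means where each assignment step queries a k-d tree built on the centers.

    init_centers is an assumption of this sketch; the paper derives its initial
    centers from the per-dimension SIFT distribution and iDistance.
    """
    centers = [list(c) for c in init_centers]
    labels = [0] * len(data)
    dim = len(data[0])
    for _ in range(iters):
        # Rebuild the tree on the current centers at the start of the iteration.
        tree = build_kdtree(centers, list(range(len(centers))))
        labels = [nearest(tree, centers, x)[1] for x in data]
        # Update step: move each center to the mean of its assigned points.
        sums = [[0.0] * dim for _ in centers]
        counts = [0] * len(centers)
        for x, label in zip(data, labels):
            counts[label] += 1
            for j in range(dim):
                sums[label][j] += x[j]
        new_centers = [
            [s / counts[c] for s in sums[c]] if counts[c] else centers[c]
            for c in range(len(centers))
        ]
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    return centers, labels
```

For example, calling `kmeans_kdtree` on the six 2-D points `(0,0), (0,1), (1,0), (10,10), (10,11), (11,10)` with initial centers `(0,0)` and `(10,10)` assigns the first three points to one cluster and the last three to the other. With real 128-dimensional SIFT descriptors a single exact k-d tree loses most of its speed advantage, which is why an approximate search would be substituted in practice.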