k-nearest neighbor (kNN) search is widely applied to low- and high-dimensional tasks, as well as to various data distributions and distance functions. However, its computational cost increases with the data volume, causing a bottleneck for many applications. The workload of existing tree-based methods increases linearly with the neighbor count in the worst case. In addition, some tree-based methods apply only to tasks with the L2 distance and may suffer severe warp divergence when employed on GPUs. Our goal is to develop a general-purpose kNN method based on cluster sorting that achieves better pruning efficiency than tree-based approaches. We optimize the proposed method to achieve higher performance on tasks with different dimensionalities or distance functions. The proposed Sort, TraversE, and then Prune (STEP) algorithm is a kNN method that clusters the data points beforehand. Across various 1) numbers of data points, 2) numbers of query points, 3) neighbor counts, 4) dimensionalities, and 5) distance metrics, the STEP method offers high performance for the following reasons. First, our method prunes the data points efficiently by sorting the clusters for each query. Second, we exploit the single-instruction, multiple-thread (SIMT) architecture of the GPU and utilize both coarse- and fine-grained parallelism to accelerate computation. The proposed method processes all queries concurrently and minimizes warp divergence by assigning each query to a GPU warp. Third, the STEP method rapidly updates the kNN results using bitonic operations. Fourth, we propose an adaptive approach that automatically switches from the indexing approach to the exhaustive approach to achieve good scalability on high-dimensional data. Finally, we develop a variant of Gärtner's bounding sphere algorithm so that our indexing method can handle distance metrics other than the L2 distance. The STEP method achieves a 15.9-times speedup with the L2 distance and a 36.7-times speedup with the angular distance compared with other state-of-the-art methods.
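The core pruning idea summarized above (sort clusters by centroid distance, traverse them in order, and skip any cluster whose lower-bound distance cannot beat the current k-th best) can be illustrated with a minimal CPU-side sketch. The sketch below assumes an L2 distance, a precomputed clustering (centroids, radii, and per-cluster index lists), and a plain top-k merge in place of the paper's bitonic update and warp-per-query GPU mapping; all function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def step_like_knn(query, centroids, radii, clusters, data, k):
    """Cluster-sort-then-prune kNN for a single query under the L2 distance.

    centroids : (C, d) array of cluster centers
    radii     : (C,) array, max distance from each center to its members
    clusters  : list of C integer index arrays into `data`
    data      : (N, d) array of data points
    """
    # Sort: order clusters by the distance from the query to each centroid.
    center_dists = np.linalg.norm(centroids - query, axis=1)
    order = np.argsort(center_dists)
    max_radius = radii.max()

    best_d = np.full(k, np.inf)   # current k best distances, ascending
    best_i = np.full(k, -1)       # matching data indices

    # Traverse: visit clusters in increasing centroid-distance order.
    for c in order:
        # Every remaining cluster is at least this far away (triangle
        # inequality with the largest radius), so we can stop entirely.
        if center_dists[c] - max_radius >= best_d[-1]:
            break
        # Prune: this cluster cannot contain a point closer than the
        # current k-th best distance, so skip it.
        if center_dists[c] - radii[c] >= best_d[-1]:
            continue
        idx = clusters[c]
        dists = np.linalg.norm(data[idx] - query, axis=1)
        # Merge the cluster's candidates into the running top-k
        # (the paper uses bitonic operations here instead).
        cand_d = np.concatenate([best_d, dists])
        cand_i = np.concatenate([best_i, idx])
        keep = np.argsort(cand_d)[:k]
        best_d, best_i = cand_d[keep], cand_i[keep]

    return best_i, best_d
```

In the full method described in the abstract, all queries run concurrently on the GPU with one warp per query, the top-k merge is performed with bitonic operations, and the variant of Gärtner's bounding sphere algorithm supplies the cluster bounds needed for distance metrics other than L2.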
