GPUMemSort: A High Performance Graphics Co-processors Sorting Algorithm for Large Scale In-Memory Data
Yin Ye, Zhihui Du, David A. Bader, Quan Yang, Weiwei Huo
GSTF International Journal on Computing, published 2014-09-04
DOI: 10.5176/2010-2283_1.2.34 (https://doi.org/10.5176/2010-2283_1.2.34)
Citations: 19
Abstract
In this paper, we present GPUMemSort, a GPU-based sorting algorithm that achieves high performance in sorting large-scale in-memory data by taking advantage of GPU processors. It consists of two algorithms: an in-core algorithm, which efficiently sorts data held in GPU global memory, and an out-of-core algorithm, which divides large-scale data into multiple chunks that fit in GPU global memory. GPUMemSort is implemented on NVIDIA’s CUDA framework, and several critical optimization methods are presented in detail. The algorithms have been tested on multiple data sets. The experimental results show that our in-core sorting can outperform other comparison-based algorithms and that GPUMemSort is highly effective in sorting large-scale in-memory data.
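The two-level structure described in the abstract can be illustrated with a minimal host-side sketch. This is not the paper's implementation. The function name `sort_in_chunks`, the parameter `chunk_capacity` (standing in for the number of elements that fit in GPU global memory), and the pluggable `in_core_sort` callback (standing in for a GPU kernel) are all hypothetical. The abstract does not say how sorted chunks are recombined, so the k-way merge used here is an assumption chosen only for simplicity.

```python
import heapq
from typing import Callable, List, Sequence


def sort_in_chunks(
    data: Sequence[int],
    chunk_capacity: int,
    in_core_sort: Callable[[List[int]], List[int]] = sorted,
) -> List[int]:
    """Sort `data` by sorting fixed-size chunks, then merging the sorted runs.

    `chunk_capacity` is a hypothetical stand-in for the number of elements
    that fit in GPU global memory; `in_core_sort` stands in for a GPU sort.
    """
    if chunk_capacity <= 0:
        raise ValueError("chunk_capacity must be positive")

    # Out-of-core step: split the input into chunks no larger than the
    # assumed device capacity.
    chunks = [
        list(data[start:start + chunk_capacity])
        for start in range(0, len(data), chunk_capacity)
    ]

    # In-core step: sort each chunk independently. In a real system this is
    # where a chunk would be copied to the device, sorted, and copied back.
    sorted_runs = [in_core_sort(chunk) for chunk in chunks]

    # Recombination (assumption, not taken from the paper): a k-way merge
    # of the sorted runs on the host.
    return list(heapq.merge(*sorted_runs))
```

For example, `sort_in_chunks([5, 3, 8, 1, 9, 2, 7], chunk_capacity=3)` sorts the chunks `[5, 3, 8]`, `[1, 9, 2]`, and `[7]` separately and merges them into `[1, 2, 3, 5, 7, 8, 9]`. Keeping the in-core sort as a callback makes it possible to swap in a different per-chunk sorter without touching the chunking logic.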