A New Fast Intersection Algorithm for Sorted Lists on GPU
Authors: Faïza Manseur, Lougmiri Zekri, M. Senouci
Journal: J. Inf. Technol. Res. (Journal Article), published 2022-01-01
DOI: 10.4018/jitr.298325 (https://doi.org/10.4018/jitr.298325)
Citations: 0
Abstract
Set intersection between sorted lists is a core operation in triangle counting, in community detection for graph analysis, and in search engines, where the intersection is computed between queries and inverted indexes. Many studies use GPU techniques to solve this intersection problem. Most of these techniques focus on improving the level of parallelism by reducing redundant comparisons and distributing the workload among GPU threads. In this paper, we propose the GPU Test with Jumps (GTWJ) algorithm, which computes the intersection between sorted lists using a new data structure. The idea of GTWJ is to group the data of each sorted list into a set of sequences. Each sequence is identified by a key and is handled by a thread. The intersection is computed only between sequences with the same key; when the keys do not match, whole data packets are skipped in parallel. A counter is used to avoid useless tests between cells of sequences with different lengths. Experiments on the data used in this field show that GTWJ is better in terms of execution time and number of tests.
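The grouping idea described in the abstract can be illustrated with a small sequential sketch. This is not the authors' GPU implementation: the abstract does not say how keys are derived or how the counter works, so the sketch assumes non-negative integers, a hypothetical key equal to the value's high bits (`value >> shift`), and a plain merge loop that stops once either sequence is exhausted. The function names and the `shift` parameter are illustrative only.

```python
def group_by_key(sorted_list, shift=4):
    """Split a sorted list of non-negative ints into sequences.

    Assumption: the key of a sequence is value >> shift. The paper's
    actual key definition is not given in the abstract.
    """
    groups = {}
    for value in sorted_list:
        groups.setdefault(value >> shift, []).append(value)
    return groups


def intersect_sequences(seq_a, seq_b):
    """Merge-intersect two sorted sequences, counting comparisons."""
    i = j = tests = 0
    common = []
    # The loop ends as soon as either sequence is exhausted, so the
    # tail of the longer sequence is never tested.
    while i < len(seq_a) and j < len(seq_b):
        tests += 1
        if seq_a[i] == seq_b[j]:
            common.append(seq_a[i])
            i += 1
            j += 1
        elif seq_a[i] < seq_b[j]:
            i += 1
        else:
            j += 1
    return common, tests


def intersect(list_a, list_b, shift=4):
    """Intersect two sorted lists by comparing only same-key sequences."""
    groups_a = group_by_key(list_a, shift)
    groups_b = group_by_key(list_b, shift)
    result, tests = [], 0
    # Sequences whose key appears in only one list are skipped entirely.
    for key in sorted(groups_a.keys() & groups_b.keys()):
        common, t = intersect_sequences(groups_a[key], groups_b[key])
        result.extend(common)
        tests += t
    return result, tests
```

In this sketch the loop over matching keys runs sequentially; in a GPU setting each matching key would be a natural unit of work for one thread, since sequences with different keys are independent of each other.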