Difference Bloom Filter: A probabilistic structure for multi-set membership query
Authors: Dongsheng Yang, Deyu Tian, Junzhi Gong, Siang Gao, Tong Yang, Xiaoming Li
Venue: 2017 IEEE International Conference on Communications (ICC), pp. 1-6
Published: 2017-05-21
DOI: 10.1109/ICC.2017.7996678 (https://doi.org/10.1109/ICC.2017.7996678)
Citations: 17
Abstract
Given v sets and an incoming item e, a multi-set membership query reports which set contains e. It is a fundamental problem in computer systems and applications. No existing data structure achieves small memory usage, fast query speed, and high accuracy at the same time. In this paper, we propose a novel probabilistic data structure named Difference Bloom Filter (DBF) for fast multi-set membership query, which is both more accurate and faster to query than the state of the art. DBF rests on two key design principles. The first is to make the representation of each element's membership exclusive by writing different numbers of 1s and 0s in the same filter. The second is to use slow but cheap DRAM to improve the accuracy of the filter held in fast but expensive SRAM. Experimental results show that DBF is hundreds of times more accurate than the state-of-the-art vBF and ShBF. Furthermore, we have made the source code of our DBF available at our homepage [1] and GitHub [2].
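
To make the problem statement concrete, the following is a minimal sketch of a multi-set membership query using a naive baseline: one ordinary Bloom filter per set. It is not the DBF design, whose encoding is not specified in this abstract. The class `BloomFilter`, the helpers `build_filters` and `query`, the salted SHA-256 hashing, and the parameters `m` and `k` are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter with m bits and k hash positions per item.

    Illustrative baseline only; this is NOT the DBF structure.
    """

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        # One byte per bit, chosen for readability rather than compactness.
        self.bits = bytearray(m)

    def _positions(self, item: str) -> list[int]:
        # Assumed hashing scheme: salt SHA-256 with the hash index i.
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item: str) -> bool:
        # True if every mapped bit is set; may be a false positive.
        return all(self.bits[p] for p in self._positions(item))


def build_filters(sets, m=1024, k=4):
    """Build one Bloom filter per input set (hypothetical helper)."""
    filters = []
    for s in sets:
        bf = BloomFilter(m, k)
        for item in s:
            bf.add(item)
        filters.append(bf)
    return filters


def query(filters, item):
    """Return the indices of all sets whose filter reports the item."""
    return [i for i, bf in enumerate(filters) if item in bf]


if __name__ == "__main__":
    sets = [{"a", "b"}, {"c"}, {"d", "e"}]
    filters = build_filters(sets)
    print(query(filters, "c"))
```

With this baseline, a query touches v separate filters and can return more than one index when a filter yields a false positive, so the answer may be ambiguous. The abstract's first design principle, an exclusive representation of membership within a single filter, appears aimed at this kind of ambiguity.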