{"title":"BhBF:一种基于Bh序列的多集隶属度查询布隆过滤器","authors":"Shuyu Pei, Kun Xie, Xin Wang, Gaogang Xie, Kenli Li, Wei Li, Yanbiao Li, Jigang Wen","doi":"10.1145/3502735","DOIUrl":null,"url":null,"abstract":"Multi-set membership query is a fundamental issue for network functions such as packet processing and state machines monitoring. Given the rigid query speed and memory requirements, it would be promising if a multi-set query algorithm can be designed based on Bloom filter (BF), a space-efficient probabilistic data structure. However, existing efforts on multi-set query based on BF suffer from at least one of the following drawbacks: low query speed, low query accuracy, limitation in only supporting insertion and query operations, or limitation in the set size. To address the issues, we design a novel Bh sequence-based Bloom filter (BhBF) for multi-set query, which supports four operations: insertion, query, deletion, and update. In BhBF, the set ID is encoded as a code in a Bh sequence. Exploiting good properties of Bh sequences, we can correctly decode the BF cells to obtain the set IDs even when the number of hash collisions is high, which brings high query accuracy. In BhBF, we propose two strategies to further speed up the query speed and increase the query accuracy. On the theoretical side, we analyze the false positive and classification failure rate of our BhBF. Our results from extensive experiments over two real datasets demonstrate that BhBF significantly advances state-of-the-art multi-set query algorithms.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"BhBF: A Bloom Filter Using Bh Sequences for Multi-set Membership Query\",\"authors\":\"Shuyu Pei, Kun Xie, Xin Wang, Gaogang Xie, Kenli Li, Wei Li, Yanbiao Li, Jigang Wen\",\"doi\":\"10.1145/3502735\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multi-set membership query is a fundamental issue for network functions such as packet processing and state machines monitoring. Given the rigid query speed and memory requirements, it would be promising if a multi-set query algorithm can be designed based on Bloom filter (BF), a space-efficient probabilistic data structure. However, existing efforts on multi-set query based on BF suffer from at least one of the following drawbacks: low query speed, low query accuracy, limitation in only supporting insertion and query operations, or limitation in the set size. To address the issues, we design a novel Bh sequence-based Bloom filter (BhBF) for multi-set query, which supports four operations: insertion, query, deletion, and update. In BhBF, the set ID is encoded as a code in a Bh sequence. Exploiting good properties of Bh sequences, we can correctly decode the BF cells to obtain the set IDs even when the number of hash collisions is high, which brings high query accuracy. In BhBF, we propose two strategies to further speed up the query speed and increase the query accuracy. On the theoretical side, we analyze the false positive and classification failure rate of our BhBF. Our results from extensive experiments over two real datasets demonstrate that BhBF significantly advances state-of-the-art multi-set query algorithms.\",\"PeriodicalId\":435653,\"journal\":{\"name\":\"ACM Transactions on Knowledge Discovery from Data (TKDD)\",\"volume\":\"45 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-03-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Knowledge Discovery from Data (TKDD)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3502735\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Knowledge Discovery from Data (TKDD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3502735","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
BhBF: A Bloom Filter Using Bh Sequences for Multi-set Membership Query
Multi-set membership query is a fundamental issue for network functions such as packet processing and state machines monitoring. Given the rigid query speed and memory requirements, it would be promising if a multi-set query algorithm can be designed based on Bloom filter (BF), a space-efficient probabilistic data structure. However, existing efforts on multi-set query based on BF suffer from at least one of the following drawbacks: low query speed, low query accuracy, limitation in only supporting insertion and query operations, or limitation in the set size. To address the issues, we design a novel Bh sequence-based Bloom filter (BhBF) for multi-set query, which supports four operations: insertion, query, deletion, and update. In BhBF, the set ID is encoded as a code in a Bh sequence. Exploiting good properties of Bh sequences, we can correctly decode the BF cells to obtain the set IDs even when the number of hash collisions is high, which brings high query accuracy. In BhBF, we propose two strategies to further speed up the query speed and increase the query accuracy. On the theoretical side, we analyze the false positive and classification failure rate of our BhBF. Our results from extensive experiments over two real datasets demonstrate that BhBF significantly advances state-of-the-art multi-set query algorithms.