Near-optimal approximate membership query over time-decaying windows

2013 Proceedings IEEE INFOCOM. Pub Date: 2013-04-14. DOI: 10.1109/INFCOM.2013.6566939

Yang Liu, Wenji Chen, Y. Guan


Citations: 20

Abstract

There has been a long history of finding a space-efficient data structure to support approximate membership queries, starting with Bloom's work in the 1970s. Given a set A of n items and an additional item x from the same universe U of size m ≫ n, we want to determine whether x ∈ A or not, using small (limited) space. Solutions to the membership query are needed in many network applications, such as cache directories, load balancing, and security. If A is static, there exist optimal algorithms that find a randomized data structure representing A using only (1 + o(1)) n log(1/δ) bits, which allows a small false-positive probability δ but no false negatives. However, existing optimal algorithms are not practical for many Internet applications, e.g., social network services, peer-to-peer systems, and network traffic monitoring. They are too space- and time-expensive when the set A changes frequently, because every change requires all items in order to recompute the optimal data structure, which takes linear running time. In this paper, we propose a novel data structure to support the approximate membership query in the time-decaying window model. In this model, items are inserted one by one over a data stream, and we want to determine whether an item is among the most recent w items for any given window size w ≤ n. Our data structure requires only O(n(log(1/δ) + log n)) bits and O(1) running time. We also prove a non-trivial space lower bound, i.e., (n - δm) log(n - δm) bits, which guarantees that our data structure is near-optimal. Our data structure has been evaluated using both synthetic and real data sets.
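To make the query model concrete, the following is a minimal Python sketch of a naive fingerprint-plus-timestamp table. It is not the data structure proposed in the paper, whose construction the abstract does not describe. The class name `WindowedMembership`, the parameter `fp_bits` (playing the role of roughly log(1/δ) bits per item), and the use of a Python dictionary are all assumptions made for illustration. The sketch only shows the interface: insert items one by one, then ask whether an item is among the most recent w items for any w ≤ n, with possible false positives but no false negatives.

```python
import hashlib


class WindowedMembership:
    """Illustrative sketch only; NOT the structure from the paper."""

    def __init__(self, n, fp_bits=16):
        self.n = n                  # largest window size that can be queried
        self.fp_bits = fp_bits      # hypothetical knob, about log2(1/delta)
        self.last_seen = {}         # fingerprint -> index of its latest arrival
        self.ring = [None] * n      # fingerprints of the last n arrivals
        self.count = 0              # number of items inserted so far

    def _fingerprint(self, item):
        # Short hash of the item; collisions are the source of false positives.
        digest = hashlib.blake2b(repr(item).encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") & ((1 << self.fp_bits) - 1)

    def insert(self, item):
        fp = self._fingerprint(item)
        pos = self.count % self.n
        old = self.ring[pos]
        # The arrival at index count - n leaves the largest window now.
        # Drop its fingerprint unless a later arrival refreshed it.
        if old is not None and self.last_seen.get(old) == self.count - self.n:
            del self.last_seen[old]
        self.ring[pos] = fp
        # A compact version would store this index modulo O(n), i.e. in
        # about log2(n) bits; the full counter is kept here for clarity.
        self.last_seen[fp] = self.count
        self.count += 1

    def query(self, item, w):
        """Return True if item may be among the most recent w items."""
        if not 1 <= w <= self.n:
            raise ValueError("window size must satisfy 1 <= w <= n")
        last = self.last_seen.get(self._fingerprint(item))
        return last is not None and last >= self.count - w
```

Each stored entry pairs a fingerprint with an arrival index, which gives an intuition for why a per-item cost of the form log(1/δ) + log n is plausible. The dictionary and ring buffer used here occupy far more memory than that and are chosen only for readability.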




