Backyard Cuckoo Hashing: Constant Worst-Case Operations with a Succinct Representation

Yuriy Arbitman, M. Naor, G. Segev
DOI: 10.1109/FOCS.2010.80 (https://doi.org/10.1109/FOCS.2010.80)
Venue: 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
Publication date: 2009-12-29
Citations: 88

Abstract

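The performance of a dynamic dictionary is measured mainly by its update time, lookup time, and space consumption. For update time and lookup time, there are known constructions that guarantee constant-time operations in the worst case with high probability; for space consumption, there are known constructions that use essentially optimal space. However, although the first analysis of a dynamic dictionary dates back more than 45 years (Knuth analyzed linear probing in 1963), the trade-off between these aspects of performance is still not completely understood. In this paper we settle two fundamental open problems:

(1) We construct the first dynamic dictionary that enjoys the best of both worlds: it stores $n$ elements using $(1 + \epsilon) n$ memory words, and guarantees constant-time operations in the worst case with high probability. Specifically, for any $\epsilon = \Omega((\log \log n / \log n)^{1/2})$ and for any sequence of polynomially many operations, with high probability over the randomness of the initialization phase, all operations are performed in constant time that is independent of $\epsilon$. The construction is a two-level variant of cuckoo hashing, augmented with a "backyard" that handles a large fraction of the elements, together with a de-amortized perfect hashing scheme for eliminating the dependency on $\epsilon$.

(2) We present a variant of the above construction that uses only $(1 + o(1)) \mathcal{B}$ bits, where $\mathcal{B}$ is the information-theoretic lower bound for representing a set of size $n$ taken from a universe of size $u$, and guarantees constant-time operations in the worst case with high probability, as before. This problem was open even in the amortized setting. One of the main ingredients of our construction is a permutation-based variant of cuckoo hashing, which significantly improves the space consumption of cuckoo hashing when dealing with a rather small universe.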
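To make the two-level idea concrete, here is a minimal Python sketch of a dictionary with fixed-capacity first-level bins whose overflow goes to a small cuckoo-hashed second level. It is a toy illustration under stated assumptions, not the paper's construction: the class name `BackyardSketch`, all parameters, the salted use of Python's built-in `hash`, and the fallback `stash` list are hypothetical, and the sketch has no de-amortization, no worst-case constant-time guarantee, and no succinct encoding.

```python
import random


class BackyardSketch:
    """Toy two-level set: fixed-capacity bins plus a cuckoo-hashed overflow area.

    Hypothetical illustration only; it does not reproduce the paper's guarantees.
    """

    def __init__(self, n_bins=64, bin_cap=4, backyard_size=32, max_kicks=50, seed=0):
        rng = random.Random(seed)
        self.bin_cap = bin_cap
        self.max_kicks = max_kicks
        self.bins = [[] for _ in range(n_bins)]           # first level
        self.tables = [[None] * backyard_size,            # second level:
                       [None] * backyard_size]            # two cuckoo tables
        self.stash = []                                   # assumption: holds keys cuckoo could not place
        self.salts = [rng.getrandbits(64) for _ in range(3)]

    def _h(self, i, key, m):
        # Salted built-in hash stands in for the hash families a real design needs.
        return hash((self.salts[i], key)) % m

    def lookup(self, key):
        if key in self.bins[self._h(0, key, len(self.bins))]:
            return True
        for i, table in enumerate(self.tables):
            if table[self._h(i + 1, key, len(table))] == key:
                return True
        return key in self.stash

    def insert(self, key):
        if self.lookup(key):
            return
        first = self.bins[self._h(0, key, len(self.bins))]
        if len(first) < self.bin_cap:
            first.append(key)
            return
        # The bin is full, so the key overflows into the cuckoo tables.
        cur, side = key, 0
        for _ in range(self.max_kicks):
            table = self.tables[side]
            pos = self._h(side + 1, cur, len(table))
            cur, table[pos] = table[pos], cur             # place cur, evict the occupant
            if cur is None:
                return
            side ^= 1                                     # evicted key tries its other table
        self.stash.append(cur)                            # give up after max_kicks evictions

    def delete(self, key):
        first = self.bins[self._h(0, key, len(self.bins))]
        if key in first:
            first.remove(key)
            return True
        for i, table in enumerate(self.tables):
            pos = self._h(i + 1, key, len(table))
            if table[pos] == key:
                table[pos] = None
                return True
        if key in self.stash:
            self.stash.remove(key)
            return True
        return False
```

A lookup inspects one bin, one slot in each cuckoo table, and the stash, so its cost is bounded only while the stash stays small. The unbounded eviction loop and the stash are exactly the parts that a de-amortized design would have to replace to obtain worst-case constant time.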