Streaming Algorithms with Few State Changes
Rajesh Jayaram, David P. Woodruff, Samson Zhou
Proceedings of the ACM on Management of Data, published 2024-05-10
DOI: https://doi.org/10.1145/3651145
Abstract
In this paper, we study streaming algorithms that minimize the number of changes made to their internal state (i.e., memory contents). While the design of streaming algorithms typically focuses on minimizing space and update time, these metrics fail to capture the asymmetric costs, inherent in modern hardware and database systems, of reading versus writing to memory. In fact, most streaming algorithms write to their memory on every update, which is undesirable when writing is significantly more expensive than reading. This raises the question of whether streaming algorithms that use both small space and a small number of memory writes are possible.
We first demonstrate that, for the fundamental $F_p$ moment estimation problem with $p \ge 1$, any streaming algorithm that achieves a constant-factor approximation must make $\Omega(n^{1-1/p})$ internal state changes, regardless of how much space it uses. Perhaps surprisingly, we show that this lower bound can be matched by an algorithm that also has near-optimal space complexity. Specifically, we give a $(1+\varepsilon)$-approximation algorithm for $F_p$ moment estimation that uses a near-optimal $\tilde{O}_\varepsilon(n^{1-1/p})$ number of state changes while simultaneously achieving near-optimal space, i.e., for $p \in [1,2)$, our algorithm uses $\mathrm{poly}(\log n, 1/\varepsilon)$ bits of space, while for $p > 2$, the algorithm uses $\tilde{O}_\varepsilon(n^{1-1/p})$ space. We similarly design streaming algorithms that are simultaneously near-optimal in both space complexity and the number of state changes for the heavy-hitters problem, sparse support recovery, and entropy estimation. Our results demonstrate that an optimal number of state changes can be achieved without sacrificing space complexity.
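To make the state-change metric concrete, the following is a minimal sketch and not an algorithm from the paper. It compares an exact counter, which writes to memory on every update, with Morris's classical approximate counter, swapped in here purely for illustration because it modifies its stored state only when a coin flip succeeds. The class names, the `state_changes` field, and the `run` helper are all hypothetical.

```python
import random


class ExactCounter:
    """Counts stream updates exactly; its stored state changes on every update."""

    def __init__(self):
        self.count = 0
        self.state_changes = 0  # instrumentation only, not part of the "memory"

    def update(self):
        self.count += 1
        self.state_changes += 1

    def estimate(self):
        return self.count


class MorrisCounter:
    """Morris approximate counter (illustrative stand-in, not the paper's method).

    Stores only an exponent x and increments it with probability 2^-x,
    so the stored state changes exactly x times over the whole stream.
    """

    def __init__(self, rng):
        self.x = 0
        self.state_changes = 0  # instrumentation only
        self.rng = rng

    def update(self):
        # Reading the state and flipping a coin does not modify memory;
        # a write happens only when the coin flip succeeds.
        if self.rng.random() < 2.0 ** -self.x:
            self.x += 1
            self.state_changes += 1

    def estimate(self):
        # 2^x - 1 is an unbiased estimate of the number of updates seen.
        return 2 ** self.x - 1


def run(n, seed=0):
    """Feed n updates to both counters and return them for inspection."""
    rng = random.Random(seed)
    exact, morris = ExactCounter(), MorrisCounter(rng)
    for _ in range(n):
        exact.update()
        morris.update()
    return exact, morris


if __name__ == "__main__":
    exact, morris = run(10_000)
    print("exact :", exact.estimate(), "state changes:", exact.state_changes)
    print("morris:", morris.estimate(), "state changes:", morris.state_changes)
```

A single Morris counter has high variance, so this sketch shows only how reads and writes can be decoupled. It says nothing about the accuracy guarantees or the $F_p$ algorithms described in the abstract.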