The White-Box Adversarial Data Stream Model

Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. Pub Date: 2022-04-19. DOI: 10.1145/3517804.3526228

M. Ajtai, V. Braverman, T. S. Jayram, Sandeep Silwal, Alec Sun, David P. Woodruff, Samson Zhou


Citations: 10

Abstract

A flurry of recent work studies streaming algorithms whose input stream is chosen adaptively by a black-box adversary that observes the output of the streaming algorithm at each time step. However, these algorithms fail when the adversary has access to the internal state of the algorithm rather than just its output. We study streaming algorithms in the white-box adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible. We first give a randomized algorithm for the L_1-heavy hitters problem that outperforms the optimal deterministic Misra-Gries algorithm on long streams. If the white-box adversary is computationally bounded, we use cryptographic techniques to reduce the memory of our L_1-heavy hitters algorithm even further and to design a number of additional algorithms for graph, string, and linear algebra problems. The existence of such algorithms is surprising, as the streaming algorithm does not even have a secret key in this model, i.e., its state is entirely known to the adversary. One algorithm we design estimates the number of distinct elements in a stream with insertions and deletions, achieving a multiplicative approximation in sublinear space; no deterministic algorithm can achieve this guarantee. We also give a general technique that translates any two-player deterministic communication lower bound into a lower bound for randomized algorithms robust to a white-box adversary. In particular, our results show that for all p ≥ 0, there exists a constant C_p > 1 such that any C_p-approximation algorithm for F_p moment estimation in insertion-only streams with a white-box adversary requires Ω(n) space for a universe of size n. Similarly, there is a constant C > 1 such that any C-approximation algorithm for matrix rank in insertion-only streams requires Ω(n) space against a white-box adversary. These results do not contradict our upper bounds, since they assume the adversary has unbounded computational power. Our algorithmic results based on cryptography thus show a separation between computationally bounded and unbounded adversaries. Finally, we prove a lower bound of Ω(log(n)) bits for the fundamental problem of deterministic approximate counting in a stream of 0s and 1s, which holds even if we know how many total stream updates we have seen so far at each point in the stream. Such a lower bound for approximate counting with additional information was previously unknown, and in our context, it shows a separation between multiplayer deterministic maximum communication and the white-box space complexity of a streaming algorithm.
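For readers unfamiliar with the deterministic baseline named in the abstract, the following is a minimal Python sketch of the textbook Misra-Gries summary. It is not the paper's randomized algorithm and is not taken from the paper; the function name `misra_gries` and the parameter `k` are illustrative choices. With at most k - 1 counters and a stream of length m, each stored count underestimates the true frequency by at most m/k, so every item occurring more than m/k times is retained. Because the procedure is deterministic, this guarantee holds for every input stream, including one chosen by an adversary who sees the full state.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Textbook Misra-Gries summary using at most k - 1 counters.

    Illustrative sketch only; names and interface are assumptions,
    not the paper's algorithm.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: increment its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A counter is free: start tracking the new item.
            counters[item] = 1
        else:
            # No free counter: decrement all counters, dropping zeros.
            # The arriving item itself is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    example = ["a"] * 6 + ["b", "c", "d"] + ["a"] * 2
    print(misra_gries(example, k=3))
```

In the example stream, "a" occurs 8 times out of m = 11 updates, so with k = 3 its stored count is guaranteed to lie between 8 - 11/3 and 8.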


