Strong Lower Bounds for Approximating Distribution Support Size and the Distinct Elements Problem

Sofya Raskhodnikova, D. Ron, Amir Shpilka, Adam D. Smith
Published in: 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07), 2007-10-21
DOI: 10.1109/FOCS.2007.67 (https://doi.org/10.1109/FOCS.2007.67)
Citations: 136

Abstract

We consider the problem of approximating the support size of a distribution from a small number of samples, when each element in the distribution appears with probability at least $1/n$. This problem is closely related to the problem of approximating the number of distinct elements in a sequence of length $n$. For both problems, we prove a lower bound on the query complexity that is nearly linear in $n$ and applies even to approximation with additive error. At the heart of the lower bound is a construction of two positive integer random variables, $X_1$ and $X_2$, with very different expectations and the following condition on the first $k$ moments: $\mathrm{E}[X_1]/\mathrm{E}[X_2] = \mathrm{E}[X_1^2]/\mathrm{E}[X_2^2] = \cdots = \mathrm{E}[X_1^k]/\mathrm{E}[X_2^k]$. Our lower bound method is also applicable to other problems. In particular, it gives new lower bounds for the sample complexity of (1) approximating the entropy of a distribution and (2) approximating how well a given string is compressed by the Lempel-Ziv scheme.
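The moment condition in the abstract can be made concrete with a small check. The sketch below is only an illustration and is not the construction from the paper: the helper names (`moment`, `moment_ratios`, `proportional_moments`) and the toy pair of distributions are assumptions chosen so that the first two moments are proportional while the third is not.

```python
from fractions import Fraction


def moment(pmf, j):
    """Return E[X^j] for a finite pmf given as {value: probability}."""
    return sum(p * Fraction(x) ** j for x, p in pmf.items())


def moment_ratios(pmf1, pmf2, k):
    """Return [E[X1^j] / E[X2^j] for j = 1..k] as exact fractions."""
    return [moment(pmf1, j) / moment(pmf2, j) for j in range(1, k + 1)]


def proportional_moments(pmf1, pmf2, k):
    """True if the first k moment ratios are all equal."""
    ratios = moment_ratios(pmf1, pmf2, k)
    return all(r == ratios[0] for r in ratios)


# Hypothetical toy pair (not from the paper):
# X1 is 1 with probability 3/4 and 3 with probability 1/4; X2 is always 2.
X1 = {1: Fraction(3, 4), 3: Fraction(1, 4)}
X2 = {2: Fraction(1)}

if __name__ == "__main__":
    print(moment_ratios(X1, X2, 3))
    print(proportional_moments(X1, X2, 2))
    print(proportional_moments(X1, X2, 3))
```

Here E[X1] = 3/2 and E[X1^2] = 3, against E[X2] = 2 and E[X2^2] = 4, so both ratios equal 3/4 and the condition holds for k = 2. The third moments are 15/2 and 8, so it fails for k = 3. The paper's construction needs the condition for larger k together with very different expectations, which this toy pair does not attempt.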