How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions

Tal Herman, Guy Rothblum
arXiv - CS - Cryptography and Security, published 2024-09-10. DOI: arxiv-2409.06594 (https://doi.org/arxiv-2409.06594)

Abstract

As statistical analyses become more central to science, industry, and society, there is a growing need to ensure the correctness of their results. Approximate correctness can be verified by replicating the entire analysis, but can we verify without replication? Building on a recent line of work, we study proof systems that allow a probabilistic verifier to ascertain that the results of an analysis are approximately correct, while drawing fewer samples and using fewer computational resources than would be needed to replicate the analysis. We focus on distribution testing problems: verifying that an unknown distribution is close to having a claimed property. Our main contribution is an interactive protocol between a verifier and an untrusted prover, which can be used to verify any distribution property that can be decided in polynomial time given a full and explicit description of the distribution. If the distribution is at statistical distance $\varepsilon$ from having the property, then the verifier rejects with high probability. This soundness property holds against any polynomial-time strategy that a cheating prover might follow, assuming the existence of collision-resistant hash functions (a standard assumption in cryptography). For distributions over a domain of size $N$, the protocol consists of $4$ messages, and the communication complexity and verifier runtime are roughly $\widetilde{O}\left(\sqrt{N} / \varepsilon^2 \right)$. The verifier's sample complexity is $\widetilde{O}\left(\sqrt{N} / \varepsilon^2 \right)$, and this is optimal up to $\mathrm{polylog}(N)$ factors (for any protocol, regardless of its communication complexity). Even for simple properties, approximately deciding whether an unknown distribution has the property can require quasi-linear sample complexity and running time. For any such property, our protocol provides a quadratic speedup over replicating the analysis.
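To make the quantities in the abstract concrete, the following is a minimal illustrative sketch, not the paper's protocol. It computes the statistical (total variation) distance between two explicitly described distributions and compares a $\sqrt{N}/\varepsilon^2$ budget with a linear-in-$N$ budget. The function names, the dictionary representation, and the decision to drop all constants and polylogarithmic factors are assumptions made for illustration only.

```python
import math


def statistical_distance(p, q):
    """Total variation distance between two distributions.

    Each distribution is a dict mapping a domain element to its probability;
    elements missing from a dict are treated as having probability 0.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def verifier_budget(n, eps):
    """sqrt(N) / eps^2, with constants and polylog factors dropped (assumption)."""
    return math.sqrt(n) / eps ** 2


def replication_budget(n):
    """Stand-in for a quasi-linear cost: N, with log factors dropped (assumption)."""
    return float(n)


if __name__ == "__main__":
    uniform = {x: 0.25 for x in range(4)}   # uniform over a domain of size 4
    skewed = {0: 0.5, 1: 0.5}               # all mass on two elements
    print("statistical distance:", statistical_distance(uniform, skewed))

    n, eps = 10 ** 6, 0.1
    print("verifier budget   ~", verifier_budget(n, eps))
    print("replication budget ~", replication_budget(n))
```

For a fixed $\varepsilon$, the first budget grows like $\sqrt{N}$ while the second grows like $N$, which is the sense in which the abstract speaks of a quadratic speedup. For small domains or very small $\varepsilon$, the $1/\varepsilon^2$ factor can dominate, so the comparison is only meaningful asymptotically in $N$.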