Optimal Distribution-Free Sample-Based Testing of Subsequence-Freeness with One-Sided Error

ACM Transactions on Computation Theory (TOCT), Pub Date: 2022-03-24, DOI: 10.1145/3512750

D. Ron, Asaf Rosin


Citations: 1

Abstract

In this work, we study the problem of testing subsequence-freeness. For a given subsequence (word) $w = w_1 \dots w_k$, a sequence (text) $T = t_1 \dots t_n$ is said to contain $w$ if there exist indices $1 \le i_1 < \dots < i_k \le n$ such that $t_{i_j} = w_j$ for every $1 \le j \le k$. Otherwise, $T$ is $w$-free. While a large majority of the research in property testing deals with algorithms that perform queries, here we consider sample-based testing (with one-sided error). In the “standard” sample-based model (i.e., under the uniform distribution), the algorithm is given samples $(i, t_i)$ where $i$ is distributed uniformly independently at random. The algorithm should distinguish between the case that $T$ is $w$-free, and the case that $T$ is $\epsilon$-far from being $w$-free (i.e., more than an $\epsilon$-fraction of its symbols should be modified so as to make it $w$-free). Freitag, Price, and Swartworth (Proceedings of RANDOM, 2017) showed that $O((k^2 \log k)/\epsilon)$ samples suffice for this testing task.

We obtain the following results.

– The number of samples sufficient for one-sided error sample-based testing (under the uniform distribution) is $O(k/\epsilon)$. This upper bound builds on a characterization that we present for the distance of a text $T$ from $w$-freeness in terms of the maximum number of copies of $w$ in $T$, where these copies should obey certain restrictions.
– We prove a matching lower bound, which holds for every word $w$. This implies that the above upper bound is tight.
– The same upper bound holds in the more general distribution-free sample-based model. In this model, the algorithm receives samples $(i, t_i)$ where $i$ is distributed according to an arbitrary distribution $p$ (and the distance from $w$-freeness is measured with respect to $p$).

We highlight the fact that while we require that the testing algorithm work for every distribution and when only provided with samples, the complexity we get matches a known lower bound for a special case of the seemingly easier problem of testing subsequence-freeness with one-sided error under the uniform distribution and with queries (Canonne et al., Theory of Computing, 2019).
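
The definitions in the abstract can be made concrete with a short sketch. The Python below is an illustration only, not the authors' algorithm or analysis: it shows the natural one-sided-error strategy of rejecting only when the sampled positions themselves spell out a copy of the word, so a w-free text is never rejected. The function names, the `weights` parameter (standing in for the distribution p), and the choice to leave `num_samples` to the caller are all assumptions; the abstract gives the sample bound only asymptotically, with no constant.

```python
import random


def contains_subsequence(text, word):
    """Return True iff `word` occurs in `text` as a (not necessarily
    contiguous) subsequence, using a greedy left-to-right scan."""
    j = 0  # number of symbols of `word` matched so far
    for symbol in text:
        if j < len(word) and symbol == word[j]:
            j += 1
    return j == len(word)


def sample_based_test(text, word, num_samples, rng, weights=None):
    """Hypothetical one-sided-error sample-based tester.

    Draws `num_samples` indices i (uniformly, or according to `weights`
    in the distribution-free setting), looks only at the pairs (i, t_i),
    and accepts (returns True) unless the sampled symbols, taken in
    index order, contain `word` as a subsequence.
    """
    n = len(text)
    drawn = rng.choices(range(n), weights=weights, k=num_samples)
    # Deduplicate: a copy of `word` needs strictly increasing indices.
    indices = sorted(set(drawn))
    sampled = [text[i] for i in indices]
    return not contains_subsequence(sampled, word)


if __name__ == "__main__":
    rng = random.Random(0)
    # "bbbaaa" is "ab"-free, so the tester accepts on every run.
    print(sample_based_test("bbbaaa", "ab", num_samples=20, rng=rng))
```

Because any subsequence of the sample is also a subsequence of the full text, a rejection always comes with a genuine witness; this is what one-sided error means here. How many samples are needed to reject texts that are far from w-free with good probability is the question the paper's bounds address.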
