
IEEE Transactions on Information Theory: Latest Articles

Predicting Truncated Galois Linear Feedback Shift Registers
IF 2.5; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-13; DOI: 10.1109/tit.2024.3442870
Han-Bing Yu, Qun-Xiong Zheng
Citations: 0
Testing Dependency of Unlabeled Databases
IF 2.2; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-13; DOI: 10.1109/TIT.2024.3442977
Vered Paslev, Wasim Huleihel
In this paper, we investigate the problem of deciding whether two random databases $\textsf{X}\in\mathcal{X}^{n\times d}$ and $\textsf{Y}\in\mathcal{Y}^{n\times d}$ are statistically dependent or not. This is formulated as a hypothesis testing problem, where under the null hypothesis, these two databases are statistically independent, while under the alternative, there exists an unknown row permutation $\sigma$, such that $\textsf{X}$ and $\textsf{Y}^{\sigma}$, a permuted version of $\textsf{Y}$, are statistically dependent with some known joint distribution, but have the same marginal distributions as the null. We characterize the thresholds at which optimal testing is information-theoretically impossible and possible, as a function of n, d, and some spectral properties of the generative distributions of the datasets. For example, we prove that if a certain function of the eigenvalues of the likelihood function and d, is below a certain threshold, as $d\to\infty$, then weak detection (performing slightly better than random guessing) is statistically impossible, no matter what the value of n is. This mimics the performance of an efficient test that thresholds a centered version of the log-likelihood function of the observed matrices. We also analyze the case where d is fixed, for which we derive strong (vanishing error) and weak detection lower and upper bounds.
IEEE Transactions on Information Theory, vol. 70, no. 10, pp. 7410-7431
Citations: 0
Transfer Learning in Bandits With Latent Continuity
IF 2.2; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-12; DOI: 10.1109/TIT.2024.3441669
Hyejin Park, Seiyun Shin, Kwang-Sung Jun, Jungseul Ok
A continuity structure of correlations among arms in a multi-armed bandit can significantly accelerate exploration and reduce regret, in particular when there are many arms. However, it is often latent in practice. To cope with the latent continuity, we consider a transfer learning setting where an agent learns the structural information, parameterized by a Lipschitz constant and an embedding of arms, from a sequence of past tasks and transfers it to a new one. We propose a simple but provably efficient algorithm to accurately estimate and fully exploit the Lipschitz continuity at the same asymptotic order as the lower bound on sample complexity in the previous tasks. The proposed algorithm is applicable to estimating not only a latent Lipschitz constant given an embedding, but also a latent embedding, although the latter requires slightly more sample complexity. To be specific, we analyze the efficiency of the proposed framework in two respects: (i) our regret bound on the new task is close to that of the oracle algorithm with full knowledge of the Lipschitz continuity under mild assumptions; and (ii) the sample complexity of our estimator matches the information-theoretic fundamental limit. Our analysis reveals a set of useful insights on transfer learning for latent Lipschitz continuity. From a numerical evaluation based on a real-world dataset of rate adaptation in a time-varying wireless channel, we demonstrate the theoretical findings and show the superiority of the proposed framework compared to baselines.
IEEE Transactions on Information Theory, vol. 70, no. 11, pp. 7952-7970
Citations: 0
Scalable Multi-Round Multi-Party Privacy-Preserving Neural Network Training
IF 2.2; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-09; DOI: 10.1109/TIT.2024.3441509
Xingyu Lu, Umit Yigit Basaran, Başak Güler
Privacy-preserving machine learning has achieved breakthrough advances in collaborative training of machine learning models, under strong information-theoretic privacy guarantees. Despite the recent advances, the communication bottleneck remains a major challenge to scalability in neural networks. To address this challenge, this paper presents the first scalable multi-party neural network training framework with linear communication complexity, significantly improving over the quadratic state-of-the-art, under strong end-to-end information-theoretic privacy guarantees. Our contribution is an iterative coded computing mechanism with linear communication complexity, termed Double Lagrange Coding, which allows iterative scalable multi-party polynomial computations without degrading the parallelization gain, adversary tolerance, and dropout resilience throughout the iterations. While providing strong multi-round information-theoretic privacy guarantees, our framework matches the state-of-the-art in adversary tolerance, resilience to user dropouts, and model accuracy, while reducing the communication overhead from quadratic to linear. In doing so, our framework addresses a key technical challenge in collaborative privacy-preserving machine learning, while paving the way for large-scale privacy-preserving iterative algorithms for deep learning and beyond.
IEEE Transactions on Information Theory, vol. 70, no. 11, pp. 8204-8236
Citations: 0
Conflict-Avoiding Codes of Prime Lengths and Cyclotomic Numbers
IF 2.2; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-09; DOI: 10.1109/TIT.2024.3439714
Liang-Chung Hsia, Hua-Chieh Li, Wei-Liang Sun
The problem of constructing optimal conflict-avoiding codes of even lengths and Hamming weight 3 is completely settled. In contrast, it is still open for odd lengths. It turns out that prime lengths are the fundamental cases that need to be constructed. In this article, we study conflict-avoiding codes of prime lengths and give a connection with the so-called cyclotomic numbers. Given some nonzero cyclotomic numbers, a well-known algorithm for constructing optimal conflict-avoiding codes works for certain prime lengths. As a consequence, we are able to determine the size of an optimal conflict-avoiding code for a new class of prime lengths.
IEEE Transactions on Information Theory, vol. 70, no. 10, pp. 6834-6841
Citations: 0
Corrections to “Private Information Retrieval Over Gaussian MAC”
IF 2.2; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-08; DOI: 10.1109/TIT.2024.3440476
Or Elimelech, Ori Shmuel, Asaf Cohen
In the above article [1], the authors introduced a PIR scheme for the Additive White Gaussian Noise (AWGN) Multiple Access Channel (MAC), both with and without fading. The authors utilized the additive nature of the channel and leveraged the linear properties and structure of lattice codes to retrieve the desired message without the servers acquiring any knowledge about the retrieved message’s index. Theorems 3 and 4 in [1] contain an error arising from the incorrect usage of the modulo operator. Moreover, the proofs assume a one-to-one mapping function, $\phi(\cdot)$, between a message $W_{j}\in\mathbb{F}_{p}^{L}$ and the elements of $\mathcal{C}$, mistakenly suggesting that the user possesses all the required information in advance. To deal with that, we defined $\phi(\cdot)$ as a one-to-one mapping function between a vector of l information bits and a lattice point $\lambda\in\mathcal{C}$. Herein, we present the corrected versions of these theorems.
IEEE Transactions on Information Theory, vol. 70, no. 10, pp. 7521-7524
Citations: 0
Missing g-Mass: Investigating the Missing Parts of Distributions
IF 2.2; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-08; DOI: 10.1109/TIT.2024.3440661
Prafulla Chandra, Andrew Thangaraj
Estimating the underlying distribution from iid samples is a classical and important problem in statistics. When the alphabet size is large compared to number of samples, a portion of the distribution is highly likely to be unobserved or sparsely observed. The missing mass, defined as the sum of probabilities $\Pr(x)$ over the missing letters x, and the Good-Turing estimator for missing mass have been important tools in large-alphabet distribution estimation. In this article, given a positive function g from $[0,1]$ to the reals, the missing g-mass, defined as the sum of $g(\Pr(x))$ over the missing letters x, is introduced and studied. The missing g-mass can be used to investigate the structure of the missing part of the distribution. Specific applications for special cases such as order-$\alpha$ missing mass ($g(p)=p^{\alpha}$) and the missing Shannon entropy ($g(p)=-p\log p$) include estimating distance from uniformity of the missing distribution and its partial estimation. Minimax estimation is studied for order-$\alpha$ missing mass for integer values of $\alpha$ and exact minimax convergence rates are obtained. Concentration is studied for a class of functions g and specific results are derived for order-$\alpha$ missing mass and missing Shannon entropy. Sub-Gaussian tail bounds with near-optimal worst-case variance factors are derived. Two new notions of concentration, named strongly sub-Gamma and filtered sub-Gaussian concentration, are introduced and shown to result in right tail bounds that are better than those obtained from sub-Gaussian concentration.
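The definitions in this abstract can be made concrete with a small sketch. The code below is illustrative only and is not taken from the article: the function names, the toy four-letter distribution, and the sample are all hypothetical. It computes the missing g-mass directly from a known distribution (something an estimator would not have access to) and, for comparison, the classical Good-Turing estimate of the missing mass, taken here as the fraction of the sample made up of letters seen exactly once.

```python
from collections import Counter
import math


def missing_g_mass(probs, sample, g):
    """Sum of g(Pr(x)) over letters x that never appear in the sample."""
    seen = set(sample)
    return sum(g(p) for x, p in probs.items() if x not in seen)


def good_turing_missing_mass(sample):
    """Good-Turing estimate: number of letters seen exactly once, divided by n."""
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)


# Hypothetical toy distribution and sample (assumptions for illustration).
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
sample = ["a", "b", "a", "a"]  # letters "c" and "d" are missing

missing_mass = missing_g_mass(probs, sample, lambda p: p)           # g(p) = p
order2_mass = missing_g_mass(probs, sample, lambda p: p ** 2)       # g(p) = p^2
missing_entropy = missing_g_mass(probs, sample, lambda p: -p * math.log(p))
gt_estimate = good_turing_missing_mass(sample)
```

In this toy case the true missing mass and the Good-Turing estimate happen to coincide; in general the estimate is random and only approximates the true value.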
IEEE Transactions on Information Theory, vol. 70, no. 10, pp. 7049-7065
Citations: 0
Worst-Case Misidentification Control in Sequential Change Diagnosis Using the Min-CuSum
IF 2.2; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-08; DOI: 10.1109/TIT.2024.3437158
Austin Warner, Georgios Fellouris
The problem of sequential change diagnosis is considered, where a sequence of independent random elements is accessed sequentially, there is an abrupt change in its distribution at some unknown time, and there are two main operational goals: to quickly detect the change, and to accurately identify upon stopping the post-change distribution among a finite set of alternatives. The focus is on the min-CuSum algorithm, which raises an alarm as soon as a CuSum statistic that corresponds to one of the post-change alternatives exceeds a certain threshold. We obtain, under certain assumptions, non-asymptotic upper bounds on its conditional probability of misidentification given that a false alarm did not occur. When, in particular, the data are generated over independent channels and the change can occur in only one of them, its worst-case—with respect to the change point—conditional probability of misidentification given that there was not a false alarm is shown to decay exponentially fast in the threshold. As a corollary, in this setup, the min-CuSum is shown to asymptotically minimize Lorden’s detection delay criterion, simultaneously for every post-change scenario, within the class of schemes that satisfy prescribed bounds on both the false alarm rate and the worst-case conditional probability of misidentification, in a regime where the latter does not go to zero faster than the former. Finally, these theoretical results are also illustrated in simulation studies.
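The stopping rule described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the data model (unit-variance Gaussian observations with pre-change mean 0 and two candidate post-change means) is chosen only to make the example concrete, and the identification rule (report the alternative whose statistic is largest at the alarm) is one natural reading of the abstract.

```python
def gaussian_llr(mu):
    """Log-likelihood ratio of N(mu, 1) against N(0, 1) for one observation."""
    return lambda x: mu * x - mu * mu / 2.0


def min_cusum(observations, llr_fns, threshold):
    """Run one CuSum statistic per post-change alternative.

    Returns (stopping_time, alternative_index) at the first time any
    statistic exceeds the threshold, or (None, None) if no alarm is raised.
    """
    stats = [0.0] * len(llr_fns)
    for t, x in enumerate(observations, start=1):
        # Standard CuSum recursion: accumulate the LLR and reset at zero.
        stats = [max(0.0, w + llr(x)) for w, llr in zip(stats, llr_fns)]
        best = max(range(len(stats)), key=stats.__getitem__)
        if stats[best] > threshold:
            return t, best
    return None, None


# Hypothetical alternatives: post-change mean +1 (index 0) or -1 (index 1).
llr_fns = [gaussian_llr(1.0), gaussian_llr(-1.0)]

# Deterministic toy data: five pre-change zeros, then the mean jumps to +1.
observations = [0.0] * 5 + [1.0] * 20
result = min_cusum(observations, llr_fns, threshold=3.0)
no_alarm = min_cusum([0.0] * 10, llr_fns, threshold=3.0)
```

With these noise-free inputs the statistic for the first alternative grows by 0.5 per post-change sample, so the alarm is raised seven samples after the change; real data would be noisy, and the threshold trades detection delay against false alarms and misidentification.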
IEEE Transactions on Information Theory, vol. 70, no. 11, pp. 8364-8377
Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10632080
Citations: 0
Multiple-Error-Correcting Codes for Analog Computing on Resistive Crossbars
IF 2.5; Zone 3 (Computer Science); Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2024-08-07; DOI: 10.1109/tit.2024.3439674
Hengjia Wei, Ron M. Roth
{"title":"Multiple-Error-Correcting Codes for Analog Computing on Resistive Crossbars","authors":"Hengjia Wei, Ron M. Roth","doi":"10.1109/tit.2024.3439674","DOIUrl":"https://doi.org/10.1109/tit.2024.3439674","url":null,"abstract":"","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"6 1","pages":""},"PeriodicalIF":2.5,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141947530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Further Study of Vectorial Dual-Bent Functions
IF 2.2 Zone 3 (Computer Science) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-08-06 DOI: 10.1109/TIT.2024.3439375
Jiaxin Wang; Fang-Wei Fu; Yadi Wei; Jing Yang
Vectorial dual-bent functions have recently attracted some researchers’ interest as they play a significant role in constructing partial difference sets, association schemes, bent partitions, and linear codes. In this paper, we further study vectorial dual-bent functions $F: V_{n}^{(p)}\rightarrow V_{m}^{(p)}$, where $2\leq m \leq \frac{n}{2}$, and $V_{n}^{(p)}$ denotes an $n$-dimensional vector space over the prime field $\mathbb{F}_{p}$. For certain vectorial dual-bent functions (called vectorial dual-bent functions with Condition A), we present a more concise characterization in terms of partial difference sets than the one given in Wang et al. (2023), and give new characterizations in terms of amorphic association schemes, linear codes, and generalized Hadamard matrices, respectively. When $p=2$, we characterize vectorial dual-bent functions with Condition A in terms of bent partitions. Through the relationship between vectorial dual-bent functions and bent partitions, new characterizations of certain bent partitions in terms of amorphic association schemes, linear codes, and generalized Hadamard matrices are obtained. For a vectorial dual-bent function $F: V_{n}^{(p)}\rightarrow V_{m}^{(p)}$ with $F(0)=0$, $F(x)=F(-x)$, where $2\leq m \leq \frac{n}{2}$, we give a necessary and sufficient condition under which the preimage set partition of $F$ induces an association scheme. By using two classes of vectorial dual-bent functions, more association schemes are obtained.
IEEE Transactions on Information Theory, vol. 70, no. 10, pp. 7472-7483. Published 2024-08-06.
Citations: 0