Pub Date : 2025-11-25 DOI: 10.1109/TIT.2025.3632187
{"title":"IEEE Transactions on Information Theory Information for Authors","authors":"","doi":"10.1109/TIT.2025.3632187","DOIUrl":"https://doi.org/10.1109/TIT.2025.3632187","url":null,"abstract":"","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"71 12","pages":"C3-C3"},"PeriodicalIF":2.9,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268980","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-25 DOI: 10.1109/TIT.2025.3614420
Shubhanshu Shekhar;Aaditya Ramdas
Lemma 2 of Shekhar and Ramdas (2024), which was used to derive the upper bound on the expected stopping time stated in (12), contains an error. In this note, we fix this error and provide the correct justification of (12), whose expression remains unchanged up to small constants.
{"title":"Corrections to “Nonparametric Two-Sample Testing by Betting”","authors":"Shubhanshu Shekhar;Aaditya Ramdas","doi":"10.1109/TIT.2025.3614420","DOIUrl":"https://doi.org/10.1109/TIT.2025.3614420","url":null,"abstract":"<xref>Lemma 2</xref> of Shekhar and Ramdas (2024), which was used to derive the upper bound on the expected stopping time stated in <xref>(12)</xref>, contains an error. In this note, we fix this error and provide the correct justification of <xref>(12)</xref>, whose expression remains unchanged up to small constants.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"71 12","pages":"9804-9806"},"PeriodicalIF":2.9,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268981","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-20 DOI: 10.1109/TIT.2025.3635388
Qinyi Lu;Nan Liu;Wei Kang;Chunguo Li
We investigate the demand-private coded caching problem, in which K users, each equipped with a cache of size M, access a library of N files under a privacy constraint. This constraint requires that no user obtain any information about the demands of others. We first present a new virtual-user-based achievable scheme for arbitrary numbers of users and files, which yields tighter order-optimal guarantees when $N \le K$ and $M \le 1$. Next, we further focus on the case $N \le K$. On the achievability side, for cache size $M \in \left[0, \frac{N}{(K+1)(N-1)}\right]$, we propose a novel demand-private scheme based on the idea that each user’s decoding process should depend only on their own demand. In terms of converse, we derive a new converse bound that is applicable for $N \leq K$ and arbitrary M. Comparing the proposed achievability and converse, we find the optimal memory-rate tradeoff of the demand-private coded caching problem for $M \in \left[0, \frac{N}{(K+1)(N-1)}\right]$ where $N \le K \le 2N-2$, and the optimal memory-rate tradeoff for $M \in \left[0, \frac{1}{K+1}\right]$ where $K > 2N-2$. Moreover, for the case of 2 files and arbitrary number of users, by deriving another new converse bound, the optimal memory-rate tradeoff is characterized for $M \in \left[0, \frac{2}{K}\right] \cup \left[\frac{2(K-1)}{K+1}, 2\right]$. Finally, we provide the optimal memory-rate tradeoff of the demand-private coded caching problem for 2 files and 3 users under arbitrary cache size M.
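The two cache-size regimes quoted in the abstract can be written out as a small lookup. This is only a sketch of the stated intervals: the function name is hypothetical, the restriction to at least 2 files is an assumption made here to keep the denominator $N-1$ nonzero, and nothing about the schemes or converse bounds themselves is encoded.

```python
from fractions import Fraction


def optimal_cache_interval_upper(N: int, K: int) -> Fraction:
    """Upper endpoint M_max of the interval [0, M_max] quoted in the abstract.

    Hypothetical helper; it only encodes the two regimes stated for N <= K:
      N <= K <= 2N - 2  ->  N / ((K + 1)(N - 1))
      K >  2N - 2       ->  1 / (K + 1)
    """
    if N < 2 or K < N:
        # Assumption of this sketch, not a statement from the paper.
        raise ValueError("this sketch assumes 2 <= N <= K")
    if K <= 2 * N - 2:
        return Fraction(N, (K + 1) * (N - 1))
    return Fraction(1, K + 1)


# Example values: 3 files with 4 users, and 2 files with 5 users.
first_regime = optimal_cache_interval_upper(3, 4)
second_regime = optimal_cache_interval_upper(2, 5)
```

Exact fractions are used so the endpoints can be compared without floating-point rounding.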
{"title":"On the Optimal Memory-Rate Tradeoff of Demand-Private Coded Caching","authors":"Qinyi Lu;Nan Liu;Wei Kang;Chunguo Li","doi":"10.1109/TIT.2025.3635388","DOIUrl":"https://doi.org/10.1109/TIT.2025.3635388","url":null,"abstract":"We investigate the demand-private coded caching problem, in which <italic>K</i> users, each equipped with a cache of size <italic>M</i>, access a library of <italic>N</i> files under a privacy constraint. This constraint requires that no user obtain any information about the demands of others. We first present a new virtual-user-based achievable scheme for arbitrary numbers of users and files, which yields tighter order-optimal guarantees when <inline-formula> <tex-math>$N le K$ </tex-math></inline-formula> and <inline-formula> <tex-math>$M le 1$ </tex-math></inline-formula>. Next, we further focus on the case <inline-formula> <tex-math>$N le K$ </tex-math></inline-formula>. On the achievability side, for cache size <inline-formula> <tex-math>$M in left [{{0, frac {N}{(K+1)(N-1)} }}right] $ </tex-math></inline-formula>, we propose a novel demand-private scheme based on the idea that each user’s decoding process should depend only on their own demand. In terms of converse, we derive a new converse bound that is applicable for <inline-formula> <tex-math>$N leq K$ </tex-math></inline-formula> and arbitrary <italic>M</i>. Comparing the proposed achievability and converse, we find the optimal memory-rate tradeoff of the demand-private coded caching problem for <inline-formula> <tex-math>$M in left [{{0, frac {N}{(K+1)(N-1)} }}right] $ </tex-math></inline-formula> where <inline-formula> <tex-math>$N le K le 2N-2$ </tex-math></inline-formula>, and the optimal memory-rate tradeoff for <inline-formula> <tex-math>$M in left [{{0, frac {1}{K+1} }}right] $ </tex-math></inline-formula> where <inline-formula> <tex-math>$ K gt 2N-2$ </tex-math></inline-formula>. 
Moreover, for the case of 2 files and arbitrary number of users, by deriving another new converse bound, the optimal memory-rate tradeoff is characterized for <inline-formula> <tex-math>$Min left [{{0,frac {2}{K}}}right] cup left [{{frac {2(K-1)}{K+1},2}}right]$ </tex-math></inline-formula>. Finally, we provide the optimal memory-rate tradeoff of the demand-private coded caching problem for 2 files and 3 users under arbitrary cache size <italic>M</i>.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"72 1","pages":"664-690"},"PeriodicalIF":2.9,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145808581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stochastic Gradient Descent (SGD) has become a cornerstone method in modern data science. However, deploying SGD in high-stakes applications necessitates rigorous quantification of its inherent uncertainty. In this work, we establish non-asymptotic Berry–Esseen bounds for linear functionals of online least-squares SGD, thereby providing a Gaussian Central Limit Theorem (CLT) in a growing-dimensional regime. Existing approaches to high-dimensional inference for projection parameters, such as Chang et al., 2023, rely on inverting empirical covariance matrices and require at least $t \gtrsim d^{3/2}$ iterations to achieve finite-sample Berry–Esseen guarantees, rendering them computationally expensive and restrictive in the allowable dimensional scaling. In contrast, we show that a CLT holds for SGD iterates when the number of iterations grows as $t \gtrsim d^{1+\delta}$ for any $\delta > 0$, significantly extending the dimensional regime permitted by prior works while improving computational efficiency. The proposed online SGD-based procedure operates in $\mathcal{O}(td)$ time and requires only $\mathcal{O}(d)$ memory, in contrast to the $\mathcal{O}(td^{2} + d^{3})$ runtime of covariance-inversion methods. To render the theory practically applicable, we further develop an online variance estimator for the asymptotic variance appearing in the CLT and establish high-probability deviation bounds for this estimator. Collectively, these results yield the first fully online and data-driven framework for constructing confidence intervals for SGD iterates in the near-optimal scaling regime $t \gtrsim d^{1+\delta}$.
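To make the $\mathcal{O}(td)$ time and $\mathcal{O}(d)$ memory figures concrete, here is a minimal sketch of one pass of least-squares SGD that reports a linear functional of the final iterate. The constant step size, the noiseless synthetic data, and all function names are assumptions of this illustration; the paper's step-size schedule and its online variance estimator are not reproduced.

```python
import random


def online_ls_sgd_functional(stream, d, step, a):
    """One pass of least-squares SGD; returns <a, theta> for the final iterate.

    Only the current iterate is stored (O(d) memory) and each sample
    costs O(d) arithmetic, so t samples cost O(t d) in total.
    """
    theta = [0.0] * d
    for x, y in stream:
        # Gradient of 0.5 * (<x, theta> - y)^2 with respect to theta.
        residual = sum(xi * ti for xi, ti in zip(x, theta)) - y
        theta = [ti - step * residual * xi for ti, xi in zip(theta, x)]
    return sum(ai * ti for ai, ti in zip(a, theta))


def noiseless_stream(theta_star, t, seed=0):
    """Hypothetical data source: Gaussian covariates, noise-free responses."""
    rng = random.Random(seed)
    for _ in range(t):
        x = [rng.gauss(0.0, 1.0) for _ in theta_star]
        yield x, sum(xi * ti for xi, ti in zip(x, theta_star))


theta_star = [1.0, -2.0, 0.5]
a = [1.0, 1.0, 1.0]
estimate = online_ls_sgd_functional(noiseless_stream(theta_star, 2000), 3, 0.05, a)
```

With noise-free responses the iterate converges to `theta_star`, so `estimate` should sit very close to the true functional value of -0.5. With noisy data the iterate fluctuates, which is the uncertainty the abstract sets out to quantify.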
{"title":"Statistical Inference for Linear Functionals of Online Least-Squares SGD When t ≳ d1+δ","authors":"Bhavya Agrawalla;Krishnakumar Balasubramanian;Promit Ghosal","doi":"10.1109/TIT.2025.3635118","DOIUrl":"https://doi.org/10.1109/TIT.2025.3635118","url":null,"abstract":"Stochastic Gradient Descent (SGD) has become a cornerstone method in modern data science. However, deploying SGD in high-stakes applications necessitates rigorous quantification of its inherent uncertainty. In this work, we establish <italic>non-asymptotic Berry–Esseen bounds</i> for linear functionals of online least-squares SGD, thereby providing a Gaussian Central Limit Theorem (CLT) in a <italic>growing-dimensional regime</i>. Existing approaches to high-dimensional inference for projection parameters, such as Chang et al., 2023, rely on inverting empirical covariance matrices and require at least <inline-formula> <tex-math>$t gtrsim d^{3/2}$ </tex-math></inline-formula> iterations to achieve finite-sample Berry–Esseen guarantees, rendering them computationally expensive and restrictive in the allowable dimensional scaling. In contrast, we show that a CLT holds for SGD iterates when the number of iterations grows as <inline-formula> <tex-math>$t gtrsim d^{1+delta }$ </tex-math></inline-formula> for any <inline-formula> <tex-math>$delta gt 0$ </tex-math></inline-formula>, significantly extending the dimensional regime permitted by prior works while improving computational efficiency. The proposed online SGD-based procedure operates in <inline-formula> <tex-math>$mathcal {O}(td)$ </tex-math></inline-formula> time and requires only <inline-formula> <tex-math>$mathcal {O}(d)$ </tex-math></inline-formula> memory, in contrast to the <inline-formula> <tex-math>$mathcal {O}(td^{2} + d^{3})$ </tex-math></inline-formula> runtime of covariance-inversion methods. 
To render the theory practically applicable, we further develop an <italic>online variance estimator</i> for the asymptotic variance appearing in the CLT and establish <italic>high-probability deviation bounds</i> for this estimator. Collectively, these results yield the first fully online and data-driven framework for constructing confidence intervals for SGD iterates in the near-optimal scaling regime <inline-formula> <tex-math>$t gtrsim d^{1+delta }$ </tex-math></inline-formula>.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"72 1","pages":"447-477"},"PeriodicalIF":2.9,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145808553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-18 DOI: 10.1109/TIT.2025.3634207
Mengchu Li;Yudong Chen;Tengyao Wang;Yi Yu
We study mean change point testing problems for high-dimensional data, with exponentially- or polynomially-decaying tails. In each case, depending on the $\ell_{0}$-norm of the mean change vector, we separately consider dense and sparse regimes. We characterise the boundary between the dense and sparse regimes under the above two tail conditions for the first time in the change point literature and propose novel testing procedures that attain optimal rates in each of the four regimes up to a poly-iterated logarithmic factor. To be specific, when the error distributions possess exponentially-decaying tails, a near-optimal CUSUM-type statistic is considered. As for polynomially-decaying tails, admitting bounded $\alpha$-th moments for some $\alpha \geq 4$, we introduce a median-of-means-type test statistic that achieves a near-optimal testing rate in both dense and sparse regimes. Our investigation of the even more challenging case of $2 \leq \alpha < 4$ unveils a new phenomenon: the minimax testing rate has no sparse regime, i.e. testing sparse changes is information-theoretically as hard as testing dense changes. Finally, we consider various extensions where we also obtain near-optimal performances, including testing against multiple change points, allowing temporal dependence as well as fewer than two finite moments in the data generating mechanisms. We also show how sub-Gaussian rates can be achieved when an additional minimal spacing condition is imposed under the alternative hypothesis.
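For readers unfamiliar with the two building blocks named in the abstract, the sketch below shows the textbook univariate versions: a CUSUM statistic for a single mean change and a median-of-means estimate. These are not the paper's high-dimensional test statistics; the function names and the equal-block splitting rule are assumptions of this illustration.

```python
from math import sqrt
from statistics import mean, median


def cusum_statistic(x):
    """Max over split points t of sqrt(t(n-t)/n) * |mean(x[:t]) - mean(x[t:])|."""
    n = len(x)
    total = sum(x)
    best, left = 0.0, 0.0
    for t in range(1, n):
        left += x[t - 1]  # running sum of the first t observations
        gap = left / t - (total - left) / (n - t)
        best = max(best, sqrt(t * (n - t) / n) * abs(gap))
    return best


def median_of_means(x, num_blocks):
    """Median of the means of num_blocks equal consecutive blocks.

    Observations left over after the last full block are dropped.
    """
    size = len(x) // num_blocks
    if size == 0:
        raise ValueError("need at least one observation per block")
    block_means = [mean(x[i * size:(i + 1) * size]) for i in range(num_blocks)]
    return median(block_means)
```

The median-of-means step is what gives robustness to heavy tails: in `median_of_means([1, 1, 1, 1, 1, 100], 3)` the single outlier corrupts only one block mean, and the median ignores it.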
{"title":"Robust Mean Change Point Testing in High-Dimensional Data With Heavy Tails","authors":"Mengchu Li;Yudong Chen;Tengyao Wang;Yi Yu","doi":"10.1109/TIT.2025.3634207","DOIUrl":"https://doi.org/10.1109/TIT.2025.3634207","url":null,"abstract":"We study mean change point testing problems for high-dimensional data, with exponentially- or polynomially-decaying tails. In each case, depending on the <inline-formula> <tex-math>$ell _{0}$ </tex-math></inline-formula>-norm of the mean change vector, we separately consider dense and sparse regimes. We characterise the boundary between the dense and sparse regimes under the above two tail conditions for the first time in the change point literature and propose novel testing procedures that attain optimal rates in each of the four regimes up to a poly-iterated logarithmic factor. To be specific, when the error distributions possess exponentially-decaying tails, a near-optimal CUSUM-type statistic is considered. As for polynomially-decaying tails, admitting bounded <inline-formula> <tex-math>$alpha $ </tex-math></inline-formula>-th moments for some <inline-formula> <tex-math>$alpha geq 4$ </tex-math></inline-formula>, we introduce a median-of-means-type test statistic that achieves a near-optimal testing rate in both dense and sparse regimes. Our investigation in the even more challenging case of <inline-formula> <tex-math>$2 leq alpha lt 4$ </tex-math></inline-formula>, unveils a new phenomenon that the minimax testing rate has no sparse regime, i.e. testing sparse changes is information-theoretically as hard as testing dense changes. Finally, we consider various extensions where we also obtain near-optimal performances, including testing against multiple change points, allowing temporal dependence as well as fewer than two finite moments in the data generating mechanisms. 
We also show how sub-Gaussian rates can be achieved when an additional minimal spacing condition is imposed under the alternative hypothesis.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"72 1","pages":"571-609"},"PeriodicalIF":2.9,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145808561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-18 DOI: 10.1109/TIT.2025.3634135
Fei Shi;Kaiyi Guo;Xiande Zhang;Qi Zhao
Quantum weight enumerators are fundamental tools for analyzing quantum error-correcting codes and multipartite entanglement, offering insights into the existence of quantum error-correcting codes and $k$-uniform states. In this work, we establish a connection between quantum weight enumerators and the $n$-qubit parallelized SWAP test. We demonstrate that each shadow enumerator corresponds to a probability derived from this test, providing a physical interpretation for the shadow enumerators. Leveraging the non-negativity of these probabilities, we present an elegant proof for the shadow inequalities. Additionally, we show that the Shor-Laflamme weight enumerators and the Rains unitary enumerators can be calculated using the $n$-qubit parallelized SWAP test. For applications, we utilize this test to compute the distances of quantum error-correcting codes, determine the $k$-uniformity of pure states, and evaluate multipartite entanglement measures. Our results indicate that quantum weight enumerators can be efficiently estimated on quantum computers, opening a path to calculate and verify the distances of quantum error-correcting codes.
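As background, the ordinary SWAP test on two pure states returns ancilla outcome 0 with probability $(1 + |\langle\psi|\phi\rangle|^2)/2$. The sketch below evaluates that formula classically from state vectors. It illustrates only this standard two-state test, not the $n$-qubit parallelized variant or its link to weight enumerators, and the function name is hypothetical.

```python
from math import sqrt


def swap_test_zero_probability(psi, phi):
    """P(ancilla = 0) = (1 + |<psi|phi>|^2) / 2 for normalized pure states.

    psi and phi are equal-length sequences of amplitudes (real or complex).
    """
    overlap = sum(a.conjugate() * b for a, b in zip(psi, phi))
    return (1 + abs(overlap) ** 2) / 2


zero = [1, 0]
one = [0, 1]
plus = [1 / sqrt(2), 1 / sqrt(2)]

p_same = swap_test_zero_probability(zero, zero)       # identical states
p_orthogonal = swap_test_zero_probability(zero, one)  # orthogonal states
p_partial = swap_test_zero_probability(zero, plus)    # squared overlap 1/2
```

Because the outcome probability is at least 1/2 for any pair of states, quantities built from such probabilities are automatically non-negative, which is the kind of fact the abstract uses for the shadow inequalities.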
{"title":"Exploring Quantum Weight Enumerators From the n-Qubit Parallelized SWAP Test","authors":"Fei Shi;Kaiyi Guo;Xiande Zhang;Qi Zhao","doi":"10.1109/TIT.2025.3634135","DOIUrl":"https://doi.org/10.1109/TIT.2025.3634135","url":null,"abstract":"Quantum weight enumerators are fundamental tools for analyzing quantum error-correcting codes and multipartite entanglement, offering insights into the existence of quantum error-correcting codes and <inline-formula> <tex-math>$k$ </tex-math></inline-formula>-uniform states. In this work, we establish a connection between quantum weight enumerators and the <inline-formula> <tex-math>$n$ </tex-math></inline-formula>-qubit parallelized SWAP test. We demonstrate that each shadow enumerator corresponds to a probability derived from this test, providing a physical interpretation for the shadow enumerators. Leveraging the non-negativity of these probabilities, we present an elegant proof for the shadow inequalities. Additionally, we show that the Shor-Laflamme weight enumerators and the Rains unitary enumerators can be calculated using the <inline-formula> <tex-math>$n$ </tex-math></inline-formula>-qubit parallelized SWAP test. For applications, we utilize this test to compute the distances of quantum error-correcting codes, determine the <inline-formula> <tex-math>$k$ </tex-math></inline-formula>-uniformity of pure states, and evaluate multipartite entanglement measures. 
Our results indicate that quantum weight enumerators can be efficiently estimated on quantum computers, opening a path to calculate and verify the distances of quantum error-correcting codes.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"72 2","pages":"1220-1231"},"PeriodicalIF":2.9,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-18 DOI: 10.1109/TIT.2025.3634168
Ziling Heng;Peng Wang;Chunlei Xie;Haiyan Zhou
In recent years, quasi-complementary sequence sets (QCSSs) have attracted widespread attention as they can support more users in MC-CDMA communications than perfect complementary sequence sets (PCSSs). The objective of this paper is to present three novel constructions of asymptotically optimal or near-optimal periodic QCSSs based on algebraic methods. Firstly, we propose a generic construction of QCSSs with small alphabet size p from polynomials over finite fields. Using the quadratic and cubic polynomials, we then respectively derive an infinite family of asymptotically optimal QCSSs and an infinite family of asymptotically near-optimal periodic QCSSs with large set sizes. Secondly, we give a construction of periodic QCSSs based on Gaussian sums which have smaller periodic tolerance than that of a known family of QCSSs. Thirdly, we present a construction of periodic QCSSs from permutation polynomials and complementary sets, yielding an infinite family of QCSSs with large set size, small periodic tolerance and low column sequence peak-to-average power ratio (PAPR).
{"title":"Large Sets of Quasi-Complementary Sequences From Polynomials Over Finite Fields and Gaussian Sums","authors":"Ziling Heng;Peng Wang;Chunlei Xie;Haiyan Zhou","doi":"10.1109/TIT.2025.3634168","DOIUrl":"https://doi.org/10.1109/TIT.2025.3634168","url":null,"abstract":"In recent years, quasi-complementary sequence sets (QCSSs) have attracted widespread attention as they can support more users in MC-CDMA communications than perfect complementary sequence sets (PCSSs). The objective of this paper is to present three novel constructions of asymptotically optimal or near-optimal periodic QCSSs based on algebraic methods. Firstly, we propose a generic construction of QCSSs with small alphabet size <italic>p</i> from polynomials over finite fields. Using the quadratic and cubic polynomials, we then respectively derive an infinite family of asymptotically optimal QCSSs and an infinite family of asymptotically near-optimal periodic QCSSs with large set sizes. Secondly, we give a construction of periodic QCSSs based on Gaussian sums which have smaller periodic tolerance than that of a known family of QCSSs. Thirdly, we present a construction of periodic QCSSs from permutation polynomials and complementary sets, yielding an infinite family of QCSSs with large set size, small periodic tolerance and low column sequence peak-to-average power ratio (PAPR).","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"72 1","pages":"729-741"},"PeriodicalIF":2.9,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145808551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the universal discrete filtering problem, where an input sequence generated by an unknown source passes through a discrete memoryless channel, and the goal is to estimate its components based on the output sequence, with limited lookahead or delay. We propose and establish the universality of a family of schemes for this setting. These schemes are induced by universal Sequential Probability Assignments (SPAs), and inherit their computational properties. We show that the schemes induced by LZ78 (a Lempel-Ziv compression algorithm) are practically implementable and well-suited for scenarios with limited computational resources and latency constraints. As a byproduct of our analysis, we obtain novel upper and lower bounds in the purely Bayesian setting using some of the intermediate results.
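The LZ78 ingredient mentioned in the abstract rests on incremental parsing: the sequence is cut into phrases, each being the shortest prefix of the remaining input not seen as a phrase before. A minimal sketch of that parsing step follows. The function name is hypothetical, and neither the sequential probability assignment built on the phrase tree nor the induced filter is shown.

```python
def lz78_parse(sequence):
    """Split a string into LZ78 phrases.

    Each phrase is the shortest prefix of the remaining input that has not
    yet appeared as a phrase. A final incomplete phrase, which may repeat an
    earlier one, is kept as the last element.
    """
    phrases = []
    seen = set()
    current = ""
    for symbol in sequence:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        phrases.append(current)  # trailing incomplete phrase
    return phrases


phrases = lz78_parse("aababc")
```

Each new phrase extends an earlier phrase by one symbol, so the parse can be stored as a tree that grows by one node per phrase, which keeps per-symbol work and memory small.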
{"title":"Universal Discrete Filtering With Lookahead or Delay","authors":"Pumiao Yan;Jiwon Jeong;Naomi Sagan;Tsachy Weissman","doi":"10.1109/TIT.2025.3633236","DOIUrl":"https://doi.org/10.1109/TIT.2025.3633236","url":null,"abstract":"We consider the universal discrete filtering problem, where an input sequence generated by an unknown source passes through a discrete memoryless channel, and the goal is to estimate its components based on the output sequence, with limited lookahead or delay. We propose and establish the universality of a family of schemes for this setting. These schemes are induced by universal Sequential Probability Assignments (SPAs), and inherit their computational properties. We show that the schemes induced by LZ78 (a Lempel-Ziv compression algorithm) are practically implementable and well-suited for scenarios with limited computational resources and latency constraints. As a byproduct of our analysis, we obtain novel upper and lower bounds in the purely Bayesian setting using some of the intermediate results.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"72 1","pages":"791-809"},"PeriodicalIF":2.9,"publicationDate":"2025-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145808624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-14 DOI: 10.1109/TIT.2025.3633042
Yu Xia;Haoyu Zhou
ReLU is a widely used activation function in deep neural networks. This paper explores the stability properties of the ReLU map. For any weight matrix $\boldsymbol{A} \in \mathbb{R}^{m \times n}$ and bias vector $\boldsymbol{b} \in \mathbb{R}^{m}$ at a given layer, we define the condition number $\kappa_{\boldsymbol{A},\boldsymbol{b}}$ as $\kappa_{\boldsymbol{A},\boldsymbol{b}} = \frac{\mathcal{U}_{\boldsymbol{A},\boldsymbol{b}}}{\mathcal{L}_{\boldsymbol{A},\boldsymbol{b}}}$, where $\mathcal{U}_{\boldsymbol{A},\boldsymbol{b}}$ and $\mathcal{L}_{\boldsymbol{A},\boldsymbol{b}}$ are the upper and lower Lipschitz constants, respectively. We first demonstrate that for any given $\boldsymbol{A}$ and $\boldsymbol{b}$, the condition number satisfies $\kappa_{\boldsymbol{A},\boldsymbol{b}} \geq \sqrt{2}$. Moreover, when the weights of the network at a given layer are initialized as random i.i.d. Gaussian variables and the bias term is set to zero, the condition number asymptotically approaches this lower bound. Our findings offer valuable insights into the characteristics of randomly initialized neural networks, contributing to a better understanding of their initial behavior and potential performance.
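A quick numerical illustration of where $\sqrt{2}$ can come from, under assumptions made here: the Lipschitz constants are taken to be the supremum and infimum of $\|\mathrm{ReLU}(\boldsymbol{A}x) - \mathrm{ReLU}(\boldsymbol{A}y)\| / \|x - y\|$ in the Euclidean norm, the bias is zero, and $\boldsymbol{A}$ has i.i.d. standard Gaussian entries. The sketch evaluates this ratio for one nearby pair and one antipodal pair. Their quotient is only a lower estimate of the condition number, and the choice of pairs is heuristic rather than taken from the paper.

```python
import math
import random


def relu_map(A, x):
    """Apply x -> ReLU(A x) with zero bias; A is a list of rows."""
    return [max(0.0, sum(aij * xj for aij, xj in zip(row, x))) for row in A]


def difference_ratio(A, x, y):
    """||ReLU(Ax) - ReLU(Ay)|| / ||x - y|| in the Euclidean norm."""
    return math.dist(relu_map(A, x), relu_map(A, y)) / math.dist(x, y)


rng = random.Random(0)
m, n = 20000, 5
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
v = [rng.gauss(0.0, 1.0) for _ in range(n)]

near = [xi + 1e-6 * vi for xi, vi in zip(x, v)]  # small perturbation of x
opposite = [-xi for xi in x]                     # antipodal point

ratio_near = difference_ratio(A, x, near)          # roughly sqrt(m / 2)
ratio_opposite = difference_ratio(A, x, opposite)  # roughly sqrt(m) / 2
kappa_lower_estimate = ratio_near / ratio_opposite
```

For a nearby pair about half of the rows are active, giving a ratio near $\sqrt{m/2}$. For the antipodal pair the difference equals $\boldsymbol{A}x$ while the distance is $2\|x\|$, giving about $\sqrt{m}/2$. The quotient of the two is close to $\sqrt{2}$, consistent with the bound stated in the abstract.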
{"title":"The Optimal Condition Number for ReLU Function","authors":"Yu Xia;Haoyu Zhou","doi":"10.1109/TIT.2025.3633042","DOIUrl":"https://doi.org/10.1109/TIT.2025.3633042","url":null,"abstract":"ReLU is a widely used activation function in deep neural networks. This paper explores the stability properties of the ReLU map. For any weight matrix <inline-formula> <tex-math>$boldsymbol {A} in mathbb {R}^{m times n}$ </tex-math></inline-formula> and bias vector <inline-formula> <tex-math>${boldsymbol {b}}in mathbb {R}^{m}$ </tex-math></inline-formula> at a given layer, we define the condition number <inline-formula> <tex-math>$kappa _{{boldsymbol {A}},{boldsymbol {b}}}$ </tex-math></inline-formula> as <inline-formula> <tex-math>$kappa _{{boldsymbol {A}},{boldsymbol {b}}} = frac {{mathcal {U}}_{{boldsymbol {A}},{boldsymbol {b}}}}{{mathcal {L}}_{{boldsymbol {A}},{boldsymbol {b}}}}$ </tex-math></inline-formula>, where <inline-formula> <tex-math>${mathcal {U}}_{{boldsymbol {A}},{boldsymbol {b}}}$ </tex-math></inline-formula> and <inline-formula> <tex-math>${mathcal {L}}_{{boldsymbol {A}},{boldsymbol {b}}}$ </tex-math></inline-formula> are the upper and lower Lipschitz constants, respectively. We first demonstrate that for any given <inline-formula> <tex-math>$boldsymbol {A}$ </tex-math></inline-formula> and <inline-formula> <tex-math>$boldsymbol {b}$ </tex-math></inline-formula>, the condition number satisfies <inline-formula> <tex-math>$kappa _{{boldsymbol {A}},{boldsymbol {b}}} geq sqrt {2}$ </tex-math></inline-formula>. Moreover, when the weights of the network at a given layer are initialized as random i.i.d. Gaussian variables and the bias term is set to zero, the condition number asymptotically approaches this lower bound. 
Our findings offer valuable insights into the characteristics of randomly initialized neural networks, contributing to a better understanding of their initial behavior and potential performance.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"72 1","pages":"710-728"},"PeriodicalIF":2.9,"publicationDate":"2025-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145808645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-11 DOI: 10.1109/TIT.2025.3631735
Liyang Lu;Haochen Wu;Wenbo Xu;Zhaocheng Wang;H. Vincent Poor
We provide new recovery bounds for hierarchical compressed sensing (HCS) based on prior support information (PSI). A detailed PSI-enabled reconstruction model is formulated using various forms of PSI. The hierarchical block orthogonal matching pursuit with PSI (HiBOMP-P) algorithm is designed in a recursive form to reliably recover hierarchically block-sparse signals. We derive exact recovery conditions (ERCs) measured by the mutual incoherence property (MIP), wherein hierarchical MIP concepts are proposed, and further develop reconstructible sparsity levels to reveal sufficient conditions for ERCs. Leveraging these MIP analyses, we present several extended insights, including reliable recovery conditions in noisy scenarios and the optimal hierarchical structure for cases where sparsity is not equal to zero. Our results further confirm that HCS offers improved recovery performance even when the prior information does not overlap with the true support set, whereas existing methods heavily rely on this overlap, thereby compromising performance if it is absent.
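The recovery conditions in the abstract are stated in terms of mutual incoherence. For orientation, the classical (non-hierarchical) mutual coherence of a measurement matrix is the largest absolute normalized inner product between two distinct columns, as sketched below. The hierarchical MIP notions and the HiBOMP-P algorithm from the paper are not reproduced, and the function name is hypothetical.

```python
import math


def mutual_coherence(columns):
    """Largest |<a_i, a_j>| / (||a_i|| ||a_j||) over distinct nonzero columns.

    columns is a list of equal-length column vectors.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    norms = [math.sqrt(dot(c, c)) for c in columns]
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            value = abs(dot(columns[i], columns[j])) / (norms[i] * norms[j])
            best = max(best, value)
    return best


orthogonal = mutual_coherence([[1.0, 0.0], [0.0, 1.0]])
correlated = mutual_coherence([[1.0, 0.0], [1.0, 1.0]])
```

Lower coherence means columns are harder to confuse with one another, which is why greedy pursuit guarantees are typically stated as upper bounds on this quantity.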
{"title":"Hierarchically Block-Sparse Recovery With Prior Support Information","authors":"Liyang Lu;Haochen Wu;Wenbo Xu;Zhaocheng Wang;H. Vincent Poor","doi":"10.1109/TIT.2025.3631735","DOIUrl":"https://doi.org/10.1109/TIT.2025.3631735","url":null,"abstract":"We provide new recovery bounds for hierarchical compressed sensing (HCS) based on prior support information (PSI). A detailed PSI-enabled reconstruction model is formulated using various forms of PSI. The hierarchical block orthogonal matching pursuit with PSI (HiBOMP-P) algorithm is designed in a recursive form to reliably recover hierarchically block-sparse signals. We derive exact recovery conditions (ERCs) measured by the mutual incoherence property (MIP), wherein hierarchical MIP concepts are proposed, and further develop reconstructible sparsity levels to reveal sufficient conditions for ERCs. Leveraging these MIP analyses, we present several extended insights, including reliable recovery conditions in noisy scenarios and the optimal hierarchical structure for cases where sparsity is not equal to zero. Our results further confirm that HCS offers improved recovery performance even when the prior information does not overlap with the true support set, whereas existing methods heavily rely on this overlap, thereby compromising performance if it is absent.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"72 1","pages":"765-790"},"PeriodicalIF":2.9,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145808591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}