IEEE Journal on Selected Areas in Information Theory: Latest Articles

A Graph-Based Soft-Decision Decoding Scheme for Reed-Solomon Codes
Pub Date : 2023-09-14 DOI: 10.1109/JSAIT.2023.3315453
Huang-Chang Lee;Jyun-Han Wu;Chung-Hsuan Wang;Yeong-Luh Ueng
This paper presents a soft decoding scheme based on the binary representations transferred from the parity-check matrices (PCMs) for Reed-Solomon (RS) codes. Referring to the modified binary PCM that has a systematic part and a high-density part corresponding to the least reliable variable nodes (LRVNs) and the most reliable variable nodes (MRVNs), respectively, an informed dynamic scheduling method, called Nested-Polling Residual Belief Propagation (NP-RBP), is applied to the corresponding Tanner graph. As with the popular adaptive BP (ABP) decoding approach, adaptation in a binary PCM based on the reliability of variable nodes is also conducted in the proposed NP-RBP decoding. The NP-RBP enables the LRVNs to receive significant updates and limits the correlation accumulation from the short cycles in the MRVNs. To enhance the error-rate performance for long codes, a bit-flipping (BF) technique is applied to correct a selection of the errors in the MRVNs, so that these errors do not propagate in the subsequent NP-RBP process. The resultant decoder is termed NP-RBP-BF. For short codes such as the (31, 25) and (63, 55) RS codes, NP-RBP is able to provide an error-rate performance close to the maximum-likelihood (ML) bound. A more significant improvement can be observed for long codes. For instance, when the proposed NP-RBP-BF decoding is applied to the (255, 239) RS code, it can provide a gain of about 0.4 dB compared to the ABP decoding, and the performance gap to the ML bound can be narrowed to about 0.25 dB at a frame error rate of $2\times 10^{-3}$.
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 420-433
Citations: 0
Graph Coded Merkle Tree: Mitigating Data Availability Attacks in Blockchain Systems Using Informed Design of Polar Factor Graphs
Pub Date : 2023-09-13 DOI: 10.1109/JSAIT.2023.3315148
Debarnab Mitra;Lev Tauz;Lara Dolecek
Data availability (DA) attack is a well-known problem in certain blockchains where users accept an invalid block with unavailable portions. Previous works have used LDPC and 2-D Reed Solomon (2D-RS) codes with Merkle trees to mitigate DA attacks. These codes perform well across various metrics such as DA detection probability and communication cost. However, these codes are difficult to apply to blockchains with large blocks due to large decoding complexity and coding fraud proof size (2D-RS codes), and intractable code guarantees for large code lengths (LDPC codes). In this paper, we focus on large block size applications and address the above challenges by proposing the novel Graph Coded Merkle Tree (GCMT): a Merkle tree encoded using polar encoding graphs. We provide a specialized polar encoding graph design algorithm called Sampling Efficient Freezing and an algorithm to prune the polar encoding graph. We demonstrate that the GCMT built using the above techniques results in a better DA detection probability and communication cost compared to LDPC codes, has a lower coding fraud proof size compared to LDPC and 2D-RS codes, provides tractable code guarantees at large code lengths (similar to 2D-RS codes), and has comparable decoding complexity to 2D-RS and LDPC codes.
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 434-452
Citations: 0
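The Graph Coded Merkle Tree entry above builds on ordinary Merkle trees. As background only, the following is a minimal sketch of a plain binary Merkle tree with inclusion proofs, which is what lets a light node check that a sampled chunk belongs to a committed block. It is not the GCMT construction: there is no polar encoding, and the function names, the SHA-256 choice, and the duplicate-last-hash padding rule are assumptions made for illustration.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest used for both leaves and internal nodes (illustrative choice)."""
    return hashlib.sha256(data).digest()


def build_levels(chunks):
    """Return all tree levels, leaf hashes first and the single root hash last."""
    level = [_h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Assumed padding rule: duplicate the last hash on odd-sized levels.
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from the leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(chunk, proof, root):
    """Recompute the root from a chunk and its proof, then compare."""
    digest = _h(chunk)
    for sibling, sibling_is_left in proof:
        digest = _h(sibling + digest) if sibling_is_left else _h(digest + sibling)
    return digest == root
```

A verifier that holds only the root can call `verify(chunk, proof, root)` on each sampled chunk; a chunk that was altered or substituted fails the check.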
On the Minimum Weight Codewords of PAC Codes: The Impact of Pre-Transformation
Pub Date : 2023-09-11 DOI: 10.1109/JSAIT.2023.3312678
Mohammad Rowshan;Jinhong Yuan
The minimum Hamming distance of a linear block code is the smallest number of bit changes required to transform one valid codeword into another. The code's minimum distance determines the code's error-correcting capabilities. Furthermore, the number of minimum weight codewords, a.k.a. the error coefficient, gives a good comparative measure for the block error rate (BLER) of linear block codes with identical minimum distance, in particular in the high-SNR regime under maximum likelihood (ML) decoding. A code with a smaller error coefficient would give a lower BLER. Unlike polar codes, a closed-form expression for the enumeration of the error coefficient of polarization-adjusted convolutional (PAC) codes is not yet known. As PAC codes are convolutionally pre-transformed polar codes, we study the impact of pre-transformation on polar codes in terms of minimum Hamming distance and error coefficient by partitioning the codewords into cosets. We show that the minimum distance of PAC codes does not decrease; however, the pre-transformation may reduce the error coefficient depending on the choice of convolutional polynomial. We recognize the properties of the cosets where pre-transformation is ineffective in decreasing the error coefficient, giving a lower bound for the error coefficient. Then, we propose a low-complexity enumeration method that determines the number of minimum weight codewords of PAC codes relying on the error coefficient of polar codes. That is, given the error coefficient $\mathcal{A}_{w_{\min}}$ of polar codes, we determine the reduction $X$ in the error coefficient due to convolutional pre-transformation in PAC coding and subtract it from the error coefficient of polar codes, $\mathcal{A}_{w_{\min}}-X$. Furthermore, we numerically analyze the tightness of the lower bound and the impact of the choice of the convolutional polynomial on the error coefficient based on the sub-patterns in the polynomial's coefficients. Finally, we show how we can further reduce the error coefficient in the cosets.
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 487-498
Citations: 1
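The two quantities defined in the PAC-code abstract above, the minimum Hamming distance and the error coefficient (number of minimum weight codewords), can be made concrete with a brute-force count on a small code. The sketch below is only an illustration of the definitions, not the paper's low-complexity enumeration method; the function name and the (7, 4) Hamming generator matrix are assumptions chosen for the example, and the matrix is assumed to have full row rank.

```python
from itertools import product


def min_distance_and_error_coefficient(G):
    """Brute-force d_min and the count of weight-d_min codewords of a binary linear code.

    G is a full-rank generator matrix given as a list of rows of 0/1 entries.
    For a linear code, d_min equals the smallest nonzero codeword weight.
    """
    k = len(G)
    weight_counts = {}
    for msg in product((0, 1), repeat=k):
        if not any(msg):
            continue  # skip the all-zero codeword
        # Codeword = msg * G over GF(2), computed column by column.
        codeword = [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]
        w = sum(codeword)
        weight_counts[w] = weight_counts.get(w, 0) + 1
    d_min = min(weight_counts)
    return d_min, weight_counts[d_min]


# Systematic generator matrix of a (7, 4) Hamming code (illustrative choice).
G_HAMMING = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

print(min_distance_and_error_coefficient(G_HAMMING))  # (3, 7)
```

The exhaustive loop visits all $2^k$ messages, so it is only practical for very short codes; this cost is the reason enumeration methods that avoid listing every codeword are of interest.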
Securely Aggregated Coded Matrix Inversion
Pub Date : 2023-09-08 DOI: 10.1109/JSAIT.2023.3312233
Neophytos Charalambides;Mert Pilanci;Alfred O. Hero
Coded computing is a method for mitigating straggling workers in a centralized computing network by using erasure-coding techniques. Federated learning is a decentralized model for training data distributed across client devices. In this work, we propose approximating the inverse of an aggregated data matrix, where the data is generated by clients, similar to the federated learning paradigm, while also being resilient to stragglers. To do so, we propose a coded computing method based on gradient coding. We modify this method so that the coordinator does not access the local data at any point, while the clients access the aggregated matrix in order to complete their tasks. The network we consider is not centrally administrated, and the communications which take place are secure against potential eavesdroppers.
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 405-419
Citations: 1
Randomized Polar Codes for Anytime Distributed Machine Learning
Pub Date : 2023-09-04 DOI: 10.1109/JSAIT.2023.3310931
Burak Bartan;Mert Pilanci
We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations. The proposed mechanism integrates the concepts of randomized sketching and polar codes in the context of coded computation. We propose a sequential decoding algorithm designed to handle real valued data while maintaining low computational complexity for recovery. Additionally, we provide an anytime estimator that can generate provably accurate estimates even when the set of available node outputs is not decodable. We demonstrate the potential applications of this framework in various contexts, such as large-scale matrix multiplication and black-box optimization. We present the implementation of these methods on a serverless cloud computing system and provide numerical results to demonstrate their scalability in practice, including ImageNet scale computations.
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 393-404
Citations: 0
Distributed Matrix Computations With Low-Weight Encodings
Pub Date : 2023-08-30 DOI: 10.1109/JSAIT.2023.3308768
Anindya Bijoy Das;Aditya Ramamoorthy;David J. Love;Christopher G. Brinton
Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum distance separable) codes into the framework; this can achieve resilience against an optimal number of stragglers. However, these codes assign dense linear combinations of submatrices to the worker nodes. When the input matrices are sparse, these approaches increase the number of non-zero entries in the encoded matrices, which in turn adversely affects the worker computation time. In this work, we develop a distributed matrix computation approach where the assigned encoded submatrices are random linear combinations of a small number of submatrices. In addition to being well suited for sparse input matrices, our approach continues to have the optimal straggler resilience in a certain range of problem parameters. Moreover, compared to recent sparse matrix computation approaches, the search for a “good” set of random coefficients to promote numerical stability in our method is much more computationally efficient. We show that our approach can efficiently utilize partial computations done by slower worker nodes in a heterogeneous system which can enhance the overall computation speed. Numerical experiments conducted through Amazon Web Services (AWS) demonstrate up to 30% reduction in per worker node computation time and $100\times$ faster encoding compared to the available methods.
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 363-378
Citations: 3
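The baseline that the low-weight encoding abstract above contrasts itself with, MDS-coded computation, can be sketched on the smallest possible case. The example below assumes a matrix-vector product split across three workers with a (3, 2) single-parity code, so that any one straggler can be ignored. It is not the paper's low-weight scheme; all names are hypothetical, and the matrix is assumed to have an even number of rows.

```python
def matvec(A, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def encode(A):
    """Split A into two row blocks and add one parity block A1 + A2.

    The parity block is a dense combination: its non-zero pattern is
    (up to cancellations) the union of the patterns of A1 and A2.
    """
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return [A1, A2, parity]


def decode(results):
    """Recover A @ x from the outputs of any two of the three workers.

    results maps worker index (0, 1, or 2) to that worker's output vector.
    """
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        y1 = results[0]
        y2 = [s - a for s, a in zip(results[2], y1)]  # (A1+A2)x - A1x
    else:
        y2 = results[1]
        y1 = [s - b for s, b in zip(results[2], y2)]  # (A1+A2)x - A2x
    return y1 + y2
```

For example, with `blocks = encode(A)`, worker `i` returns `matvec(blocks[i], x)`, and the coordinator calls `decode` as soon as any two outputs arrive. Note how the parity worker multiplies by a block that is denser than either original block, which is the cost for sparse inputs that the abstract points to.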
Channel Coding at Low Capacity
Pub Date : 2023-08-16 DOI: 10.1109/JSAIT.2023.3305874
Mohammad Fereydounian;Hamed Hassani;Mohammad Vahid Jamali;Hessam Mahdavifar
Low-capacity scenarios have become increasingly important in the technology of the Internet of Things (IoT) and the next generation of wireless networks. Such scenarios require efficient and reliable transmission over channels with an extremely small capacity. Within these constraints, the state-of-the-art coding techniques may not be directly applicable. Moreover, the prior work on the finite-length analysis of optimal channel coding provides inaccurate predictions of the limits in the low-capacity regime. In this paper, we study channel coding at low capacity from two perspectives: fundamental limits at finite length and code constructions. We first specify what a low-capacity regime means. We then characterize finite-length fundamental limits of channel coding in the low-capacity regime for various types of channels, including binary erasure channels (BECs), binary symmetric channels (BSCs), and additive white Gaussian noise (AWGN) channels. From the code construction perspective, we characterize the optimal number of repetitions for transmission over binary memoryless symmetric (BMS) channels, in terms of the code blocklength and the underlying channel capacity, such that the capacity loss due to the repetition is negligible. Furthermore, it is shown that capacity-achieving polar codes naturally adopt the aforementioned optimal number of repetitions.
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 351-362
Citations: 12
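The low-capacity abstract above states that repetition costs little capacity when the number of repetitions is chosen appropriately. A rough numerical illustration, assuming a binary symmetric channel with crossover probability `p` strictly between 0 and 1 and a uniform input bit repeated `r` times, is to compare the information $I(X; Y^r)$ delivered by the repeated bit against the $r \cdot C$ that unrestricted coding could deliver over the same channel uses. This sketch is not the paper's characterization of the optimal number of repetitions; the function names and the parameter values mentioned afterwards are illustrative assumptions.

```python
from math import comb, log2


def bsc_capacity(p):
    """Capacity of BSC(p) in bits per channel use, for 0 < p < 1."""
    return 1.0 + p * log2(p) + (1.0 - p) * log2(1.0 - p)


def repetition_information(p, r):
    """I(X; Y^r) in bits for a uniform bit X sent r times over BSC(p).

    The number k of received ones is a sufficient statistic, so the sum
    runs over k instead of over all 2^r output sequences.
    """
    total = 0.0
    for k in range(r + 1):
        p_k_given_0 = comb(r, k) * p**k * (1 - p) ** (r - k)
        p_k_given_1 = comb(r, k) * (1 - p) ** k * p ** (r - k)
        p_k = 0.5 * (p_k_given_0 + p_k_given_1)
        for cond in (p_k_given_0, p_k_given_1):
            if cond > 0:
                total += 0.5 * cond * log2(cond / p_k)
    return total


def repetition_efficiency(p, r):
    """Fraction of the r-use capacity budget r*C that repetition retains."""
    return repetition_information(p, r) / (r * bsc_capacity(p))
```

With a very noisy channel such as `p = 0.49`, ten repetitions retain nearly all of $r \cdot C$, whereas at `p = 0.11` the same ten repetitions retain only a small fraction, because a single repeated bit can never carry more than one bit.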
Continuous-Time Distributed Filtering With Sensing and Communication Constraints
Pub Date : 2023-08-10 DOI: 10.1109/JSAIT.2023.3304249
Zhenyu Liu;Andrea Conti;Sanjoy K. Mitter;Moe Z. Win
Distributed filtering is crucial in many applications such as localization, radar, autonomy, and environmental monitoring. The aim of distributed filtering is to infer time-varying unknown states using data obtained via sensing and communication in a network. This paper analyzes continuous-time distributed filtering with sensing and communication constraints. In particular, the paper considers a building-block system of two nodes, where each node is tasked with inferring a time-varying unknown state. At each time, the two nodes obtain noisy observations of the unknown states via sensing and perform communication via a Gaussian feedback channel. The distributed filter of the unknown state is computed based on both the sensor observations and the received messages. We analyze the asymptotic performance of the distributed filter by deriving a necessary and sufficient condition of the sensing and communication capabilities under which the mean-square error of the distributed filter is bounded over time. Numerical results are presented to validate the derived necessary and sufficient condition.
Journal: IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 667-681
Citations: 0
On the Implementation of Boolean Functions on Content-Addressable Memories
Pub Date : 2023-08-07 DOI: 10.1109/JSAIT.2023.3279333
Ron M. Roth
Let $[q\rangle$ denote the integer set $\{0,1,\ldots,q-1\}$ and let $\mathbb{B}=\{0,1\}$. The problem of implementing functions $[q\rangle \rightarrow \mathbb{B}$ on content-addressable memories (CAMs) is considered. CAMs can be classified by the input alphabet and the state alphabet of their cells; for example, in binary CAMs, those alphabets are both $\mathbb{B}$, while in a ternary CAM (TCAM), both alphabets are endowed with a “don’t care” symbol. This work is motivated by recent proposals for using CAMs for fast inference on decision trees. In such learning models, the tree nodes carry out integer comparisons, such as testing equality ($x=t$?) or inequality ($x\le t$?), where $x\in[q\rangle$ is an input to the node and $t\in[q\rangle$ is a node parameter. A CAM implementation of such comparisons includes mapping (i.e., encoding) $t$ into internal states of some number $n$ of cells and mapping $x$ into inputs to these cells, with the goal of minimizing $n$. Such mappings are presented for various comparison families, as well as for the set of all functions $[q\rangle \rightarrow \mathbb{B}$, under several scenarios of input and state alphabets of the CAM cells. All those mappings are shown to be optimal in that they attain the smallest possible $n$ for any given $q$.
IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 379-392, 2023.
Citations: 0
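To make the CAM abstract above more concrete, the following is a minimal sketch of how a comparison can be stored as cell states and evaluated against an encoded input. It is an illustration only and does not reproduce the paper's optimal mappings: it uses a plain fixed-width binary encoding, models the "don't care" symbol only on the state side, and handles the inequality test only for thresholds of the form 2^k - 1. All function names (`encode_binary`, `cam_match`, `equality_row`, `le_row_power_of_two`) are hypothetical.

```python
DONT_CARE = "*"  # ternary "don't care" state; assumed here to exist only in cell states


def encode_binary(value: int, q: int) -> str:
    """Encode an integer from {0, ..., q-1} as a fixed-width bit string."""
    if not 0 <= value < q:
        raise ValueError("value must lie in [0, q)")
    n = max(1, (q - 1).bit_length())  # cells used by this naive mapping
    return format(value, f"0{n}b")


def cam_match(states: str, inputs: str) -> bool:
    """A row matches when each cell equals its input or stores a don't-care."""
    if len(states) != len(inputs):
        raise ValueError("states and inputs must have the same length")
    return all(s == DONT_CARE or s == x for s, x in zip(states, inputs))


def equality_row(t: int, q: int) -> str:
    """Cell states for the test x = t: simply the binary encoding of t."""
    return encode_binary(t, q)


def le_row_power_of_two(k: int, n: int) -> str:
    """Cell states for the test x <= 2**k - 1 on n cells: leading zeros, then k don't-cares."""
    if not 0 <= k <= n:
        raise ValueError("k must lie in [0, n]")
    return "0" * (n - k) + DONT_CARE * k


if __name__ == "__main__":
    q = 8
    row_eq = equality_row(5, q)
    row_le = le_row_power_of_two(2, 3)
    print([x for x in range(q) if cam_match(row_eq, encode_binary(x, q))])
    print([x for x in range(q) if cam_match(row_le, encode_binary(x, q))])
```

The naive encoding spends about log2(q) cells per comparison; the abstract's contribution is precisely to characterize the smallest number of cells for each comparison family and alphabet scenario, which this sketch makes no attempt to achieve.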
Genomic Compression With Read Alignment at the Decoder
Pub Date : 2023-08-01 DOI: 10.1109/JSAIT.2023.3300831
Yotam Gershon;Yuval Cassuto
We propose a new compression scheme for genomic data given as sequence fragments called reads. The scheme uses a reference genome at the decoder side only, freeing the encoder from the burdens of storing references and performing computationally costly alignment operations. The main ingredient of the scheme is a multi-layer code construction, delivering to the decoder sufficient information to align the reads, correct their differences from the reference, validate their reconstruction, and correct reconstruction errors. The core of the method is the well-known concept of distributed source coding with decoder side information, fortified by a generalized-concatenation code construction enabling efficient embedding of all the information needed for reliable reconstruction. We first present the scheme for the case of substitution errors only between the reads and the reference, and then extend it to support reads with a single deletion and multiple substitutions. A central tool in this extension is a new distance metric that is shown analytically to improve alignment performance over existing distance metrics.
IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 314-330, 2023.
Citations: 1
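The genomic compression abstract above builds on distributed source coding with decoder side information. The textbook form of that idea can be sketched with a (7,4) Hamming code: the encoder sends only the 3-bit syndrome of a 7-bit read, and the decoder, which holds a reference differing from the read in at most one position, recovers the read exactly. This is a toy under strong assumptions (binary symbols instead of nucleotides, alignment already known, at most one substitution); it is not the paper's multi-layer generalized-concatenation construction, and the function names are hypothetical.

```python
from functools import reduce


def syndrome(bits):
    """Syndrome under the (7,4) Hamming parity-check matrix whose i-th column
    is the binary expansion of i: the XOR of the 1-based positions holding a 1."""
    if len(bits) != 7:
        raise ValueError("this sketch handles 7-bit blocks only")
    positions = (i for i, b in enumerate(bits, start=1) if b)
    return reduce(lambda acc, i: acc ^ i, positions, 0)


def encode(read):
    """Encoder: no reference needed, transmit 3 bits instead of 7."""
    return syndrome(read)


def decode(read_syndrome, reference):
    """Decoder: the XOR of the two syndromes is the syndrome of (read XOR reference),
    which for a single substitution equals the 1-based position of that substitution."""
    pos = read_syndrome ^ syndrome(reference)
    recovered = list(reference)
    if pos:
        recovered[pos - 1] ^= 1
    return recovered


if __name__ == "__main__":
    reference = [1, 0, 1, 1, 0, 0, 1]
    read = [1, 0, 1, 1, 1, 0, 1]  # differs from the reference at position 5
    assert decode(encode(read), reference) == read
    print("recovered read matches the original")
```

The encoder never touches the reference, which mirrors the abstract's point that storage of references and alignment are pushed entirely to the decoder side.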