IEEE Journal on Selected Areas in Information Theory: latest articles, page 9

Distributed Matrix Computations With Low-Weight Encodings

IEEE journal on selected areas in information theory

Pub Date : 2023-08-30 DOI: 10.1109/JSAIT.2023.3308768

Anindya Bijoy Das;Aditya Ramamoorthy;David J. Love;Christopher G. Brinton

Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum distance separable) codes into the framework; this can achieve resilience against an optimal number of stragglers. However, these codes assign dense linear combinations of submatrices to the worker nodes. When the input matrices are sparse, these approaches increase the number of non-zero entries in the encoded matrices, which in turn adversely affects the worker computation time. In this work, we develop a distributed matrix computation approach where the assigned encoded submatrices are random linear combinations of a small number of submatrices. In addition to being well suited for sparse input matrices, our approach continues to have the optimal straggler resilience in a certain range of problem parameters. Moreover, compared to recent sparse matrix computation approaches, the search for a “good” set of random coefficients to promote numerical stability in our method is much more computationally efficient. We show that our approach can efficiently utilize partial computations done by slower worker nodes in a heterogeneous system which can enhance the overall computation speed. Numerical experiments conducted through Amazon Web Services (AWS) demonstrate up to 30% reduction in per worker node computation time and $100\times$ faster encoding compared to the available methods.
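To make the sparsity argument concrete, the following is a minimal sketch, not the authors' scheme. It assumes hypothetical parameters (8 sparse submatrices, an encoding weight of 2) and uses positive entries and coefficients so that no cancellation occurs. It compares the number of non-zero entries in a dense encoding, which mixes all submatrices, against a low-weight encoding, which mixes only a few.

```python
import random

def random_sparse_block(rows, cols, density, rng):
    """Return a sparse block as {(i, j): value} with positive entries."""
    return {
        (i, j): rng.uniform(1.0, 2.0)
        for i in range(rows)
        for j in range(cols)
        if rng.random() < density
    }

def encode(blocks, indices, rng):
    """Random linear combination of the blocks selected by `indices`."""
    encoded = {}
    for idx in indices:
        coeff = rng.uniform(1.0, 2.0)  # positive, so no cancellation occurs
        for pos, val in blocks[idx].items():
            encoded[pos] = encoded.get(pos, 0.0) + coeff * val
    return encoded

rng = random.Random(0)
k, weight = 8, 2  # hypothetical: 8 submatrices, each encoding mixes only 2
blocks = [random_sparse_block(50, 50, 0.02, rng) for _ in range(k)]

dense = encode(blocks, range(k), rng)  # mixes all k blocks
low_weight = encode(blocks, rng.sample(range(k), weight), rng)  # mixes 2 blocks

nnz_dense, nnz_low = len(dense), len(low_weight)
print(f"non-zeros: dense={nnz_dense}, low-weight={nnz_low}")
```

Because the support of a combination is the union of the supports of the blocks it mixes, the low-weight encoding can never have more non-zeros than the dense one in this setup, which is what keeps the worker's sparse multiplication cheap.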


Vol. 4, pp. 363-378.

Citations: 3

Channel Coding at Low Capacity

IEEE journal on selected areas in information theory

Pub Date : 2023-08-16 DOI: 10.1109/JSAIT.2023.3305874

Mohammad Fereydounian;Hamed Hassani;Mohammad Vahid Jamali;Hessam Mahdavifar

Low-capacity scenarios have become increasingly important in the technology of the Internet of Things (IoT) and the next generation of wireless networks. Such scenarios require efficient and reliable transmission over channels with an extremely small capacity. Within these constraints, the state-of-the-art coding techniques may not be directly applicable. Moreover, the prior work on the finite-length analysis of optimal channel coding provides inaccurate predictions of the limits in the low-capacity regime. In this paper, we study channel coding at low capacity from two perspectives: fundamental limits at finite length and code constructions. We first specify what a low-capacity regime means. We then characterize finite-length fundamental limits of channel coding in the low-capacity regime for various types of channels, including binary erasure channels (BECs), binary symmetric channels (BSCs), and additive white Gaussian noise (AWGN) channels. From the code construction perspective, we characterize the optimal number of repetitions for transmission over binary memoryless symmetric (BMS) channels, in terms of the code blocklength and the underlying channel capacity, such that the capacity loss due to the repetition is negligible. Furthermore, it is shown that capacity-achieving polar codes naturally adopt the aforementioned optimal number of repetitions.
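The claim that repetition costs little capacity when the channel capacity is small can be checked with a short calculation. This is a sketch for the BEC only, not the paper's analysis. It uses the standard fact that a symbol repeated r times over a BEC with erasure probability eps is lost only if all r copies are erased, so the effective channel is a BEC with erasure probability eps^r. The operating point of capacity 0.01 is a hypothetical choice.

```python
def bec_capacity(eps):
    """Capacity of a binary erasure channel with erasure probability eps."""
    return 1.0 - eps

def repetition_rate_bec(eps, r):
    """Capacity per channel use when each symbol is repeated r times on a BEC."""
    return (1.0 - eps ** r) / r

capacity = 0.01  # hypothetical low-capacity operating point
eps = 1.0 - capacity
for r in (1, 2, 8, 32, 128):
    loss = 1.0 - repetition_rate_bec(eps, r) / bec_capacity(eps)
    print(f"r={r:4d}  relative capacity loss={loss:.3f}")
```

For small capacity C, the per-use rate is approximately C(1 - (r-1)C/2), so the relative loss stays small as long as r is small compared with 1/C and grows once r becomes comparable to 1/C.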


Vol. 4, pp. 351-362.

Citations: 12

Continuous-Time Distributed Filtering With Sensing and Communication Constraints

IEEE journal on selected areas in information theory

Pub Date : 2023-08-10 DOI: 10.1109/JSAIT.2023.3304249

Zhenyu Liu;Andrea Conti;Sanjoy K. Mitter;Moe Z. Win

Distributed filtering is crucial in many applications such as localization, radar, autonomy, and environmental monitoring. The aim of distributed filtering is to infer time-varying unknown states using data obtained via sensing and communication in a network. This paper analyzes continuous-time distributed filtering with sensing and communication constraints. In particular, the paper considers a building-block system of two nodes, where each node is tasked with inferring a time-varying unknown state. At each time, the two nodes obtain noisy observations of the unknown states via sensing and perform communication via a Gaussian feedback channel. The distributed filter of the unknown state is computed based on both the sensor observations and the received messages. We analyze the asymptotic performance of the distributed filter by deriving a necessary and sufficient condition of the sensing and communication capabilities under which the mean-square error of the distributed filter is bounded over time. Numerical results are presented to validate the derived necessary and sufficient condition.


Citations: 0

On the Implementation of Boolean Functions on Content-Addressable Memories

IEEE journal on selected areas in information theory

Pub Date : 2023-08-07 DOI: 10.1109/JSAIT.2023.3279333

Ron M. Roth

Let $[q\rangle$ denote the integer set $\{0,1,\ldots,q-1\}$ and let $\mathbb{B}=\{0,1\}$. The problem of implementing functions $[q\rangle \rightarrow \mathbb{B}$ on content-addressable memories (CAMs) is considered. CAMs can be classified by the input alphabet and the state alphabet of their cells; for example, in binary CAMs, those alphabets are both $\mathbb{B}$, while in a ternary CAM (TCAM), both alphabets are endowed with a “don’t care” symbol. This work is motivated by recent proposals for using CAMs for fast inference on decision trees. In such learning models, the tree nodes carry out integer comparisons, such as testing equality ($x=t$?) or inequality ($x\le t$?), where $x\in [q\rangle$ is an input to the node and $t\in [q\rangle$ is a node parameter. A CAM implementation of such comparisons includes mapping (i.e., encoding) $t$ into internal states of some number $n$ of cells and mapping $x$ into inputs to these cells, with the goal of minimizing $n$. Such mappings are presented for various comparison families, as well as for the set of all functions $[q\rangle \rightarrow \mathbb{B}$, under several scenarios of input and state alphabets of the CAM cells. All those mappings are shown to be optimal in that they attain the smallest possible $n$ for any given $q$.
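The idea of encoding a node parameter into cell states and an input into cell inputs can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the optimal mappings from the paper. It uses the plain binary expansion for the equality test and allows a "don't care" symbol only in the stored states. All function names are hypothetical.

```python
from math import ceil, log2

DONT_CARE = "*"

def to_bits(value, n):
    """Most-significant-bit-first binary expansion of `value` on n cells."""
    return [(value >> (n - 1 - i)) & 1 for i in range(n)]

def cam_match(states, inputs):
    """A row matches when every cell either stores '*' or equals its input."""
    return all(s == DONT_CARE or s == x for s, x in zip(states, inputs))

def encode_equality(t, q):
    """Binary-CAM states for the test x == t, using ceil(log2 q) cells."""
    n = max(1, ceil(log2(q)))
    return to_bits(t, n)

def encode_prefix(prefix_bits, n):
    """TCAM states matching every input that starts with `prefix_bits`."""
    return list(prefix_bits) + [DONT_CARE] * (n - len(prefix_bits))

q, t = 8, 5
row = encode_equality(t, q)
matches = [x for x in range(q) if cam_match(row, to_bits(x, len(row)))]
print(matches)  # only x == t matches

upper_half = encode_prefix([1], 3)  # single TCAM row for the test x >= 4
upper_matches = [x for x in range(q) if cam_match(upper_half, to_bits(x, 3))]
print(upper_matches)
```

The second example shows why the "don't care" symbol matters: one ternary row covers a whole range of inputs, which a binary CAM row cannot do.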


Vol. 4, pp. 379-392.

Citations: 0

Genomic Compression With Read Alignment at the Decoder

IEEE journal on selected areas in information theory

Pub Date : 2023-08-01 DOI: 10.1109/JSAIT.2023.3300831

Yotam Gershon;Yuval Cassuto

We propose a new compression scheme for genomic data given as sequence fragments called reads. The scheme uses a reference genome at the decoder side only, freeing the encoder from the burdens of storing references and performing computationally costly alignment operations. The main ingredient of the scheme is a multi-layer code construction, delivering to the decoder sufficient information to align the reads, correct their differences from the reference, validate their reconstruction, and correct reconstruction errors. The core of the method is the well-known concept of distributed source coding with decoder side information, fortified by a generalized-concatenation code construction enabling efficient embedding of all the information needed for reliable reconstruction. We first present the scheme for the case of substitution errors only between the reads and the reference, and then extend it to support reads with a single deletion and multiple substitutions. A central tool in this extension is a new distance metric that is shown analytically to improve alignment performance over existing distance metrics.
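The principle of distributed source coding with decoder side information can be shown on a toy binary example. This is a sketch of the classical syndrome approach using the [7,4] Hamming code, not the multi-layer construction of the paper. It assumes the read is already aligned to a 7-bit reference segment and differs from it in at most one position. The encoder sends a 3-bit syndrome instead of the 7-bit read and never sees the reference.

```python
def syndrome(bits):
    """Syndrome under the [7,4] Hamming code: XOR of 1-based positions of ones."""
    s = 0
    for pos, bit in enumerate(bits, start=1):
        if bit:
            s ^= pos
    return s

def encode_read(read):
    """Encoder side: compress a 7-bit read to its 3-bit syndrome."""
    return syndrome(read)

def decode_read(received_syndrome, reference):
    """Decoder side: recover the read from its syndrome and the reference."""
    # The XOR of the two syndromes is the syndrome of the difference pattern,
    # which for a single substitution equals its 1-based position.
    diff = received_syndrome ^ syndrome(reference)
    recovered = list(reference)
    if diff:
        recovered[diff - 1] ^= 1
    return recovered

reference = [1, 0, 1, 1, 0, 0, 1]
read = [1, 0, 1, 0, 0, 0, 1]  # differs from the reference at position 4
recovered = decode_read(encode_read(read), reference)
print(recovered == read)
```

The encoder's work here is a single pass over the read, while the comparison against the reference happens entirely at the decoder, which mirrors the division of labor described in the abstract.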


Citations: 1

Machine Learning-Aided Efficient Decoding of Reed–Muller Subcodes

IEEE journal on selected areas in information theory

Pub Date : 2023-07-25 DOI: 10.1109/JSAIT.2023.3298362

Mohammad Vahid Jamali;Xiyang Liu;Ashok Vardhan Makkuva;Hessam Mahdavifar;Sewoong Oh;Pramod Viswanath

Reed-Muller (RM) codes achieve the capacity of general binary-input memoryless symmetric channels and are conjectured to have a comparable performance to that of random codes in terms of scaling laws. However, such results are established assuming maximum-likelihood decoders for general code parameters. Also, RM codes only admit limited sets of rates. Efficient decoders such as successive cancellation list (SCL) decoder and recently-introduced recursive projection-aggregation (RPA) decoders are available for RM codes at finite lengths. In this paper, we focus on subcodes of RM codes with flexible rates. We first extend the RPA decoding algorithm to RM subcodes. To lower the complexity of our decoding algorithm, referred to as subRPA, we investigate different approaches to prune the projections. Next, we derive the soft-decision based version of our algorithm, called soft-subRPA, that not only improves upon the performance of subRPA but also enables a differentiable decoding algorithm. Building upon the soft-subRPA algorithm, we then provide a framework for training a machine learning (ML) model to search for good sets of projections that minimize the decoding error rate. Training our ML model enables achieving very close to the performance of full-projection decoding with a significantly smaller number of projections. We also show that the choice of the projections in decoding RM subcodes matters significantly, and our ML-aided projection pruning scheme is able to find a good selection, i.e., with negligible performance degradation compared to the full-projection case, given a reasonable number of projections.


Vol. 4, pp. 260-275.

Citations: 1

Efficient Algorithms for the Bee-Identification Problem

IEEE journal on selected areas in information theory

Pub Date : 2023-07-18 DOI: 10.1109/JSAIT.2023.3296077

Han Mao Kiah;Alexander Vardy;Hanwen Yao

The bee-identification problem, formally defined by Tandon, Tan, and Varshney (2019), requires the receiver to identify “bees” using a set of unordered noisy measurements. In this previous work, Tandon, Tan, and Varshney studied error exponents and showed that decoding the measurements jointly results in a significantly larger error exponent. In this work, we study algorithms related to this joint decoder. First, we demonstrate how to perform joint decoding efficiently. By reducing to the problem of finding perfect matching and minimum-cost matchings, we obtain joint decoders that run in time quadratic and cubic in the number of “bees” for the binary erasure (BEC) and binary symmetric channels (BSC), respectively. Next, by studying the matching algorithms in the context of channel coding, we further reduce the running times by using classical tools like peeling decoders and list-decoders. In particular, we show that our identifier algorithms when used with Reed-Muller codes terminate in almost linear and quadratic time for BEC and BSC, respectively. Finally, for explicit codebooks, we study when these joint decoders fail to identify the “bees” correctly. Specifically, we provide practical methods of estimating the probability of erroneous identification for given codebooks.
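Joint decoding over a BSC amounts to finding the assignment of codewords to measurements with the smallest total Hamming distance. The sketch below illustrates that objective by brute force over all permutations, which takes factorial time and is only usable for a handful of bees. The paper's point is that the same objective can be solved as a minimum-cost matching in polynomial time. The codebook and measurements are made-up values.

```python
from itertools import permutations

def hamming(a, b):
    """Number of positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def joint_decode(codebook, received):
    """Return perm where perm[i] is the codeword index assigned to received[i]."""
    best = min(
        permutations(range(len(codebook))),
        key=lambda perm: sum(
            hamming(codebook[c], received[i]) for i, c in enumerate(perm)
        ),
    )
    return list(best)

codebook = ["00000", "11100", "00111", "11011"]
received = ["01111", "00001", "11000", "11011"]  # shuffled, with some bit flips
assignment = joint_decode(codebook, received)
print(assignment)  # [2, 0, 1, 3]
```

Replacing the permutation search with a Hungarian-type matching algorithm on the same cost matrix would give the cubic-time behavior mentioned for the BSC.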


Vol. 4, pp. 205-218.

Citations: 3

Active Privacy-Utility Trade-Off Against Inference in Time-Series Data Sharing

IEEE journal on selected areas in information theory

Pub Date : 2023-06-28 DOI: 10.1109/JSAIT.2023.3287929

Ecenaz Erdemir;Pier Luigi Dragotti;Deniz Gündüz

Internet of Things devices have become highly popular thanks to the services they offer. However, they also raise privacy concerns since they share fine-grained time-series user data with untrusted third parties. We model the user’s personal information as the secret variable, to be kept private from an honest-but-curious service provider, and the useful variable, to be disclosed for utility. We consider an active learning framework, where one out of a finite set of measurement mechanisms is chosen at each time step, each revealing some information about the underlying secret and useful variables, albeit with different statistics. The measurements are taken such that the correct value of useful variable can be detected quickly, while the confidence on the secret variable remains below a predefined level. For privacy measure, we consider both the probability of correctly detecting the secret variable value and the mutual information between the secret and released data. We formulate both problems as partially observable Markov decision processes, and numerically solve by advantage actor-critic deep reinforcement learning. We evaluate the privacy-utility trade-off of the proposed policies on both the synthetic and real-world time-series datasets.


Vol. 4, pp. 159-173.

Citations: 4

SPRT-Based Efficient Best Arm Identification in Stochastic Bandits

IEEE journal on selected areas in information theory

Pub Date : 2023-06-23 DOI: 10.1109/JSAIT.2023.3288988

Arpan Mukherjee;Ali Tajer

This paper investigates the best arm identification (BAI) problem in stochastic multi-armed bandits in the fixed confidence setting. The general class of the exponential family of bandits is considered. The existing algorithms for the exponential family of bandits face computational challenges. To mitigate these challenges, the BAI problem is viewed and analyzed as a sequential composite hypothesis testing task, and a framework is proposed that adopts the likelihood ratio-based tests known to be effective for sequential testing. Based on this test statistic, a BAI algorithm is designed that leverages the canonical sequential probability ratio tests for arm selection and is amenable to tractable analysis for the exponential family of bandits. This algorithm has two key features: (1) its sample complexity is asymptotically optimal, and (2) it is guaranteed to be $\delta$-PAC. Existing efficient approaches focus on the Gaussian setting and require Thompson sampling for the arm deemed the best and the challenger arm. Additionally, this paper analytically quantifies the computational expense of identifying the challenger in an existing approach. Finally, numerical experiments are provided to support the analysis.
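For readers unfamiliar with the sequential probability ratio test that the algorithm builds on, the following is a minimal sketch of Wald's classical SPRT between two simple Bernoulli hypotheses. It is not the composite-hypothesis BAI procedure of the paper. The thresholds use Wald's standard approximations, and the parameter values in the usage lines are hypothetical.

```python
from math import log

def sprt_bernoulli(samples, p0, p1, alpha, beta):
    """Wald's SPRT between H0: p = p0 and H1: p = p1 for 0/1 samples."""
    upper = log((1 - beta) / alpha)  # accept H1 at or above this threshold
    lower = log(beta / (1 - alpha))  # accept H0 at or below this threshold
    llr = 0.0  # running log-likelihood ratio of H1 against H0
    for n, x in enumerate(samples, start=1):
        llr += log(p1 / p0) if x else log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(samples)

print(sprt_bernoulli([1] * 20, p0=0.4, p1=0.6, alpha=0.05, beta=0.05))
print(sprt_bernoulli([0] * 20, p0=0.4, p1=0.6, alpha=0.05, beta=0.05))
```

The test stops as soon as the accumulated evidence crosses either threshold, so the number of samples adapts to how clearly the data favor one hypothesis. That adaptivity is the property a fixed-confidence bandit algorithm exploits.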



Citations: 2

Dual-Blind Deconvolution for Overlaid Radar-Communications Systems

IEEE journal on selected areas in information theory

Pub Date : 2023-06-22 DOI: 10.1109/JSAIT.2023.3287823

Edwin Vargas;Kumar Vijay Mishra;Roman Jacome;Brian M. Sadler;Henry Arguello

The increasingly crowded spectrum has spurred the design of joint radar-communications systems that share hardware resources and efficiently use the radio frequency spectrum. We study a general spectral coexistence scenario, wherein the channels and transmit signals of both radar and communications systems are unknown at the receiver. In this dual-blind deconvolution (DBD) problem, a common receiver admits a multi-carrier wireless communications signal that is overlaid with the radar signal reflected off multiple targets. The communications and radar channels are represented by continuous-valued range-time and Doppler velocities of multiple transmission paths and multiple targets. We exploit the sparsity of both channels to solve the highly ill-posed DBD problem by casting it as a sum of multivariate atomic norms (SoMAN) minimization. We devise a semidefinite program to estimate the unknown target and communications parameters using the theories of positive-hyperoctant trigonometric polynomials (PhTP). Our theoretical analyses show that the minimum number of samples required for near-perfect recovery depends on the logarithm of the maximum of the numbers of radar targets and communications paths rather than their sum. We show that our SoMAN method and PhTP formulations are also applicable to more general scenarios such as unsynchronized transmission, the presence of noise, and multiple emitters. Numerical experiments demonstrate great performance enhancements during parameter recovery under different scenarios.
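As a rough illustration of what a sum-of-atomic-norms program looks like, the generic form can be written as below. This is a sketch under assumed notation and not the paper's exact formulation: the atom sets $\mathcal{A}_r$ and $\mathcal{A}_c$, the linear measurement operators $\mathcal{B}_r$ and $\mathcal{B}_c$, and the observation vector $y$ are all hypothetical symbols introduced here.

```latex
% Atomic norm induced by an atom set \mathcal{A} (standard definition)
\|X\|_{\mathcal{A}} = \inf\{\, t > 0 : X \in t\,\operatorname{conv}(\mathcal{A}) \,\}

% Generic sum-of-atomic-norms program for two superimposed components
\min_{X_r,\, X_c}\ \|X_r\|_{\mathcal{A}_r} + \|X_c\|_{\mathcal{A}_c}
\quad \text{subject to} \quad
y = \mathcal{B}_r(X_r) + \mathcal{B}_c(X_c)
```

Each norm promotes a representation of its component with few atoms, which is how the sparsity of the radar and communications channels mentioned in the abstract would enter such a program.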



Citations: 5