
arXiv - CS - Information Theory: Latest Publications

Properties of Shannon and Rényi entropies of the Poisson distribution as the functions of intensity parameter
Pub Date : 2024-02-06 DOI: arxiv-2403.08805
Volodymyr Braiman, Anatoliy Malyarenko, Yuliya Mishura, Yevheniia Anastasiia Rudyk
We consider two types of entropy, namely, the Shannon and Rényi entropies of the Poisson distribution, and establish their properties as functions of the intensity parameter. More precisely, we prove that both entropies increase with intensity. While for Shannon entropy the proof is comparatively simple, for Rényi entropy, which depends on an additional parameter $\alpha>0$, the proof is nontrivial. It is based on an application of Karamata's inequality to the terms of the Poisson distribution.
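As a quick numerical illustration of the monotonicity claim (not taken from the paper), the sketch below evaluates the Shannon entropy and the Rényi entropy $H_\alpha(p) = \frac{1}{1-\alpha}\log\sum_k p_k^\alpha$ of Poisson($\lambda$) on a truncated support for a few intensities; the truncation point k_max = 200 and the choice $\alpha = 0.5$ are arbitrary.

```python
# Numerical illustration only (not from the paper): Shannon and Renyi entropies of
# Poisson(lam), with the support truncated at k_max, evaluated on increasing intensities.
import math

def poisson_pmf(lam, k_max=200):
    """p_0, ..., p_{k_max}, computed recursively as p_{k+1} = p_k * lam / (k + 1)."""
    p, out = math.exp(-lam), []
    for k in range(k_max + 1):
        out.append(p)
        p *= lam / (k + 1)
    return out

def shannon_entropy(p):
    return -sum(pk * math.log(pk) for pk in p if pk > 0.0)

def renyi_entropy(p, alpha):
    """H_alpha(p) = log(sum_k p_k**alpha) / (1 - alpha), for alpha > 0, alpha != 1."""
    return math.log(sum(pk**alpha for pk in p if pk > 0.0)) / (1.0 - alpha)

alpha = 0.5
for lam in (0.5, 1.0, 2.0, 4.0, 8.0):
    p = poisson_pmf(lam)
    print(f"lambda={lam:4.1f}  Shannon={shannon_entropy(p):.4f}  "
          f"Renyi({alpha})={renyi_entropy(p, alpha):.4f}")
```

Both printed columns increase down the table, matching the behaviour the abstract proves.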
Citations: 0
TexShape: Information Theoretic Sentence Embedding for Language Models
Pub Date : 2024-02-05 DOI: arxiv-2402.05132
H. Kaan Kale, Homa Esfahanizadeh, Noel Elias, Oguzhan Baser, Muriel Medard, Sriram Vishwanath
With the exponential growth in data volume and the emergence of data-intensive applications, particularly in the field of machine learning, concerns related to resource utilization, privacy, and fairness have become paramount. This paper focuses on the textual domain of data and addresses challenges regarding encoding sentences to their optimized representations through the lens of information theory. In particular, we use empirical estimates of mutual information, using the Donsker-Varadhan definition of Kullback-Leibler divergence. Our approach leverages this estimation to train an information-theoretic sentence embedding, called TexShape, for (task-based) data compression or for filtering out sensitive information, enhancing privacy and fairness. In this study, we employ a benchmark language model for initial text representation, complemented by neural networks for information-theoretic compression and mutual information estimations. Our experiments demonstrate significant advancements in preserving maximal targeted information and minimal sensitive information over adverse compression ratios, in terms of predictive accuracy of downstream models that are trained using the compressed data.
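The mutual-information machinery mentioned in the abstract can be made concrete with a small sketch. This is not the TexShape pipeline (which trains a neural critic and a sentence encoder); it only evaluates the Donsker-Varadhan lower bound I(X;Y) >= E_P[T] - log E_Q[exp T] with the closed-form density-ratio critic on correlated Gaussians, where the bound is tight and can be checked against the closed-form value -0.5*log(1 - rho^2). The correlation rho, sample size, and critic choice are my assumptions.

```python
# Hedged sketch (not the TexShape training code): Donsker-Varadhan lower bound on
# mutual information, evaluated with the exact density-ratio critic
# T(x, y) = log p(x, y) - log p(x) - log p(y) for unit-variance Gaussians.
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 200_000

# Joint samples (x, y) ~ P_XY and "product" samples (x, y_shuffled) ~ P_X x P_Y.
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
y_shuf = rng.permutation(y)

def critic(x, y, rho):
    """log p(x,y) - log p(x)p(y) for a bivariate Gaussian with unit variances."""
    log_joint = (-0.5 * (x**2 - 2 * rho * x * y + y**2) / (1 - rho**2)
                 - np.log(2 * np.pi * np.sqrt(1 - rho**2)))
    log_marginals = -0.5 * x**2 - 0.5 * y**2 - np.log(2 * np.pi)
    return log_joint - log_marginals

dv = critic(x, y, rho).mean() - np.log(np.exp(critic(x, y_shuf, rho)).mean())
print(f"DV estimate: {dv:.4f} nats,  closed form: {-0.5 * np.log(1 - rho**2):.4f} nats")
```

In TexShape the critic is a learned neural network rather than a closed-form ratio, which is what makes the bound usable when the densities are unknown.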
Citations: 0
Characterization of the Distortion-Perception Tradeoff for Finite Channels with Arbitrary Metrics
Pub Date : 2024-02-03 DOI: arxiv-2402.02265
Dror Freirich, Nir Weinberger, Ron Meir
When inspected by humans, reconstructed signals should be indistinguishable from real ones. Typically, such high perceptual quality comes at the price of high reconstruction error, and vice versa. We study this distortion-perception (DP) tradeoff over finite-alphabet channels, for the Wasserstein-$1$ distance induced by a general metric as the perception index, and an arbitrary distortion matrix. Under this setting, we show that computing the DP function and the optimal reconstructions is equivalent to solving a set of linear programming problems. We provide a structural characterization of the DP tradeoff, where the DP function is piecewise linear in the perception index. We further derive a closed-form expression for the case of binary sources.
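To see why the DP function reduces to linear programming, here is a toy instance built on assumptions of mine, not the paper's construction: a binary source observed through a binary symmetric channel, Hamming distortion, and a perception constraint |P_Xhat(1) - P_X(1)| <= P, which coincides with the Wasserstein-1 distance on a binary alphabet under the unit metric. The decoder q(xhat|y) is the LP variable.

```python
# Toy distortion-perception LP (illustrative assumptions, not the paper's setting):
# binary source X ~ Bernoulli(p1) through a BSC(eps), Hamming distortion, and a
# perception constraint on the reconstruction marginal.
import numpy as np
from scipy.optimize import linprog

p1, eps = 0.2, 0.3                       # P(X=1) and BSC crossover probability
p_x = np.array([1 - p1, p1])
p_xy = np.array([[p_x[0] * (1 - eps), p_x[0] * eps],
                 [p_x[1] * eps,       p_x[1] * (1 - eps)]])   # p(x, y)
p_y = p_xy.sum(axis=0)

# Decision variables: q = [q(0|0), q(1|0), q(0|1), q(1|1)].
# Expected Hamming distortion: sum_{x,y,xhat} p(x,y) q(xhat|y) 1[x != xhat].
c = np.array([p_xy[1, 0], p_xy[0, 0], p_xy[1, 1], p_xy[0, 1]])

# Marginal of the reconstruction: P_Xhat(1) = sum_y p(y) q(1|y).
a_perc = np.array([0.0, p_y[0], 0.0, p_y[1]])

def dp_value(P):
    """Minimum distortion subject to |P_Xhat(1) - P_X(1)| <= P."""
    A_ub = np.vstack([a_perc, -a_perc])
    b_ub = np.array([p_x[1] + P, P - p_x[1]])
    A_eq = np.array([[1, 1, 0, 0], [0, 0, 1, 1]])            # q(.|y) sums to 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1, 1], bounds=(0, 1))
    return res.fun

for P in (0.0, 0.05, 0.1, 0.2):
    print(f"P = {P:.2f}  ->  D(P) = {dp_value(P):.4f}")
```

The printed D(P) decreases as the perception constraint is relaxed and flattens once P exceeds the perception index of the unconstrained (MAP-style) decoder, consistent with a piecewise-linear DP curve.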
Citations: 0
The Information of Large Language Model Geometry
Pub Date : 2024-02-01 DOI: arxiv-2402.03471
Zhiquan Tan, Chenghai Li, Weiran Huang
This paper investigates the information encoded in the embeddings of large language models (LLMs). We conduct simulations to analyze the representation entropy and discover a power law relationship with model sizes. Building upon this observation, we propose a theory based on (conditional) entropy to elucidate the scaling law phenomenon. Furthermore, we delve into the auto-regressive structure of LLMs and examine the relationship between the last token and previous context tokens using information theory and regression techniques. Specifically, we establish a theoretical connection between the information gain of new tokens and ridge regression. Additionally, we explore the effectiveness of Lasso regression in selecting meaningful tokens, which sometimes outperforms the closely related attention weights. Finally, we conduct controlled experiments, and find that information is distributed across tokens, rather than being concentrated in specific "meaningful" tokens alone.
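The abstract does not define "representation entropy"; one common proxy is the spectral (von Neumann-style) entropy of the embedding covariance, sketched below on placeholder Gaussian embeddings purely to show how such a quantity would be computed. The paper's own definition, models, and power-law findings may differ, so treat every name and number here as an assumption.

```python
# Hedged proxy only: spectral entropy of the trace-normalized embedding covariance,
# computed on random placeholder "embeddings" (not real LLM hidden states).
import numpy as np

def representation_entropy(embeddings):
    """Entropy of the eigenvalue distribution of the covariance of (n_tokens, dim) data."""
    x = embeddings - embeddings.mean(axis=0, keepdims=True)
    cov = x.T @ x / x.shape[0]
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    p = eig / eig.sum()                      # normalized spectrum acts as a distribution
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
for dim in (64, 256, 1024):                  # stand-in for increasing "model size"
    emb = rng.standard_normal((4096, dim))
    print(dim, round(representation_entropy(emb), 3))
```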
Citations: 0
Extracting and visualizing a new classification system for Colombia's National Administrative Department of Statistics. A visual analytics framework case study
Pub Date : 2024-01-29 DOI: arxiv-2401.15994
Pierre RaimbaudUNIANDES, Jaime Camilo Espitia CastilloUNIANDES, John Guerra-GomezNortheastern University, Silicon Valley Campus
In a world filled with data, a nation is expected to take decisions informed by data. However, countries need to first collect and publish such data in a way meaningful for both citizens and policy makers. A good thematic classification could be instrumental in helping users navigate and find the right resources on a rich data repository such as the one collected by Colombia's National Administrative Department of Statistics (DANE). The Visual Analytics Framework is a methodology for conducting visual analysis developed by T. Munzner et al. [T. Munzner, Visualization Analysis and Design, A K Peters Visualization Series, 1, 2014] that could help with this task. This paper presents a case study applying such a framework, conducted to help the DANE better visualize their data repository and present a more understandable classification of it. It describes the three main analysis tasks identified, the proposed solutions, and the collection of insights generated from them.
Citations: 0
Predictability and Randomness
Pub Date : 2024-01-23 DOI: arxiv-2401.13066
Lenhart K. Schubert
Algorithmic theories of randomness can be related to theories of probabilistic sequence prediction through the notion of a predictor, defined as a function which supplies lower bounds on initial-segment probabilities of infinite sequences. An infinite binary sequence $z$ is called unpredictable iff its initial-segment "redundancy" $n+\log p(z(n))$ remains sufficiently low relative to every effective predictor $p$. A predictor which maximizes the initial-segment redundancy of a sequence is called optimal for that sequence. It turns out that a sequence is random iff it is unpredictable. More generally, a sequence is random relative to an arbitrary computable distribution iff the distribution is itself an optimal predictor for the sequence. Here "random" can be taken in the sense of Martin-Löf by using weak criteria of effectiveness, or in the sense of Schnorr by using stronger criteria of effectiveness. Under the weaker criteria of effectiveness it is possible to construct a universal predictor which is optimal for all infinite sequences. This predictor assigns nonvanishing limit probabilities precisely to the recursive sequences. Under the stronger criteria of effectiveness it is possible to establish a law of large numbers for sequences random relative to a computable distribution, which may be useful as a criterion of "rationality" for methods of probabilistic prediction. A remarkable feature of effective predictors is the fact that they are expressible in the special form first proposed by Solomonoff. In this form sequence prediction reduces to assigning high probabilities to initial segments with short and/or numerous encodings. This fact provides the link between theories of randomness and Solomonoff's theory of prediction.
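A concrete, non-paper way to see the initial-segment redundancy at work: take the Krichevsky-Trofimov (KT) estimator as a particular effective predictor, measure the redundancy n + log2 p(z_1..z_n) (base-2 logs are my choice), and compare a coin-flip-like sequence, whose redundancy stays low, with the all-zeros sequence, whose redundancy grows roughly linearly because the predictor learns it almost perfectly.

```python
# Illustrative sketch only: initial-segment redundancy n + log2 p(z_1..z_n) of the
# Krichevsky-Trofimov predictor on two contrasting binary sequences.
import math
import random

def kt_log2_prob(bits):
    """log2 of the KT sequential probability assigned to a binary sequence."""
    n0 = n1 = 0
    logp = 0.0
    for b in bits:
        p1 = (n1 + 0.5) / (n0 + n1 + 1.0)    # KT predictive probability of the next 1
        logp += math.log2(p1 if b else 1.0 - p1)
        n1 += b
        n0 += 1 - b
    return logp

def redundancy(bits):
    return len(bits) + kt_log2_prob(bits)

random.seed(0)
n = 10_000
coin = [random.randint(0, 1) for _ in range(n)]   # pseudorandom stand-in for a "random" z
zeros = [0] * n                                   # a highly compressible, recursive z
print("redundancy, coin-flip sequence :", round(redundancy(coin), 2))
print("redundancy, all-zeros sequence :", round(redundancy(zeros), 2))
```

The coin-flip sequence keeps its redundancy near zero (unpredictable to this predictor), while the all-zeros sequence accumulates redundancy close to n, i.e. it is highly predictable.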
Citations: 0
Rate-Distortion-Perception Tradeoff Based on the Conditional-Distribution Perception Measure
Pub Date : 2024-01-22 DOI: arxiv-2401.12207
Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu
We study the rate-distortion-perception (RDP) tradeoff for a memoryless source model in the asymptotic limit of large block-lengths. Our perception measure is based on a divergence between the distributions of the source and reconstruction sequences conditioned on the encoder output, which was first proposed in [1], [2]. We consider the case when there is no shared randomness between the encoder and the decoder. For the case of discrete memoryless sources we derive a single-letter characterization of the RDP function, thus settling a problem that remains open for the marginal metric introduced in Blau and Michaeli [3] (with no shared randomness). Our achievability scheme is based on lossy source coding with a posterior reference map proposed in [4]. For the case of continuous valued sources under squared error distortion measure and squared quadratic Wasserstein perception measure we also derive a single-letter characterization and show that a noise-adding mechanism at the decoder suffices to achieve the optimal representation. For the case of zero perception loss, we show that our characterization interestingly coincides with the results for the marginal metric derived in [5], [6] and again demonstrate that zero perception loss can be achieved with a $3$-dB penalty in the minimum distortion. Finally we specialize our results to the case of Gaussian sources. We derive the RDP function for vector Gaussian sources and propose a waterfilling type solution. We also partially characterize the RDP function for a mixture of vector Gaussians.
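For orientation only, and not the paper's RDP result: the classical reverse-waterfilling solution of the plain rate-distortion function of an independent vector Gaussian source under MSE, R(D) = sum_i 0.5*log2(sigma_i^2 / D_i) with D_i = min(theta, sigma_i^2), is the textbook baseline that waterfilling-type arguments such as the paper's build on. The variances and distortion target below are arbitrary.

```python
# Background sketch: classical reverse waterfilling for a vector Gaussian source
# under MSE (no perception constraint); the water level theta is found by bisection.
import math

def reverse_waterfilling(variances, D_target, tol=1e-12):
    """Return (theta, per-component distortions, rate in bits) with sum_i D_i = D_target."""
    lo, hi = 0.0, max(variances)
    while hi - lo > tol:
        theta = 0.5 * (lo + hi)
        D = sum(min(theta, v) for v in variances)
        lo, hi = (theta, hi) if D < D_target else (lo, theta)
    theta = 0.5 * (lo + hi)
    D_i = [min(theta, v) for v in variances]
    rate = sum(0.5 * math.log2(v / d) for v, d in zip(variances, D_i) if d < v)
    return theta, D_i, rate

variances = [4.0, 2.0, 1.0, 0.25]
theta, D_i, rate = reverse_waterfilling(variances, D_target=1.5)
print(f"water level {theta:.4f}, distortions {[round(d, 4) for d in D_i]}, rate {rate:.4f} bits")
```

Components whose variance falls below the water level are not coded at all; the paper's waterfilling-type RDP solution additionally accounts for the perception constraint.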
Citations: 0
Near-Field Localization with $1$-bit Quantized Hybrid A/D Reception
Pub Date : 2024-01-22 DOI: arxiv-2401.12029
Ioannis Gavras, Italo Atzeni, George C. Alexandropoulos
In this paper, we consider a hybrid Analog and Digital (A/D) receiver architecture with an extremely large Dynamic Metasurface Antenna (DMA) and a $1$-bit resolution Analog-to-Digital Converter (ADC) at each of its reception radio-frequency chains, and present a localization approach for User Equipment (UE) lying in its near-field regime. The proposed algorithm scans the UE area of interest to identify the DMA-based analog combining configuration resulting in the peak of a received pseudo-spectrum, yielding the UE position estimate in three dimensions. Our simulation results demonstrate the validity of the proposed scheme, especially for increasing DMA sizes, and showcase the interplay among various system parameters.
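A heavily simplified sketch of the scan-and-peak idea described above, under assumptions of my own: a fully digital uniform linear array with per-element 1-bit quantization stands in for the DMA-based hybrid receiver, the pseudo-spectrum is a plain matched filter |a(p)^H y|^2, and the search is over a 2D grid rather than the paper's three-dimensional DMA-configuration scan. Carrier frequency, array size, SNR, and grid are all illustrative choices.

```python
# Toy near-field localization (not the paper's DMA/hybrid algorithm): grid-search the
# peak of a matched-filter pseudo-spectrum computed from a 1-bit quantized snapshot.
import numpy as np

rng = np.random.default_rng(1)
c, fc = 3e8, 28e9
lam = c / fc
N = 128                                            # array elements
elem_y = (np.arange(N) - (N - 1) / 2) * lam / 2    # ULA along the y-axis at x = 0
p_true = np.array([3.0, 0.7])                      # UE position (x, y) in meters

def steering(p):
    """Near-field steering vector: per-element phase from the exact element-to-UE distance."""
    d = np.sqrt(p[0] ** 2 + (p[1] - elem_y) ** 2)
    return np.exp(-1j * 2 * np.pi * d / lam)

# One received snapshot at 0 dB per-element SNR, then 1-bit quantization of I and Q.
noise = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
y = steering(p_true) + noise
y_q = (np.sign(y.real) + 1j * np.sign(y.imag)) / np.sqrt(2)

# Scan a 2D grid; the point maximizing the pseudo-spectrum is the position estimate.
xs, ys = np.linspace(1.0, 5.0, 81), np.linspace(-1.0, 1.0, 81)
spec = np.array([[np.abs(steering((x, yy)).conj() @ y_q) ** 2 for yy in ys] for x in xs])
ix, iy = np.unravel_index(np.argmax(spec), spec.shape)
print("true position:", tuple(p_true), " estimate:", (xs[ix], ys[iy]))
```

Even from sign-only samples the matched-filter spectrum peaks near the true position; range (x) estimates are coarser than cross-range (y) ones because they rely on wavefront curvature across the aperture.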
Citations: 0
Generalization and Informativeness of Conformal Prediction
Pub Date : 2024-01-22 DOI: arxiv-2401.11810
Matteo Zecchin, Sangwoo Park, Osvaldo Simeone, Fredrik Hellström
The safe integration of machine learning modules in decision-making processes hinges on their ability to quantify uncertainty. A popular technique to achieve this goal is conformal prediction (CP), which transforms an arbitrary base predictor into a set predictor with coverage guarantees. While CP certifies the predicted set to contain the target quantity with a user-defined tolerance, it does not provide control over the average size of the predicted sets, i.e., over the informativeness of the prediction. In this work, a theoretical connection is established between the generalization properties of the base predictor and the informativeness of the resulting CP prediction sets. To this end, an upper bound is derived on the expected size of the CP set predictor that builds on generalization error bounds for the base predictor. The derived upper bound provides insights into the dependence of the average size of the CP set predictor on the amount of calibration data, the target reliability, and the generalization performance of the base predictor. The theoretical insights are validated using simple numerical regression and classification tasks.
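For readers unfamiliar with CP, the standard split-conformal recipe below exhibits the two quantities the paper relates: empirical coverage (the guarantee) and average prediction-set size (the informativeness). The toy data, the cubic base predictor, and alpha = 0.1 are arbitrary choices, and the sketch does not implement the paper's bounds.

```python
# Minimal split conformal prediction for regression: calibrate a residual quantile,
# then report empirical coverage and the resulting interval (set) size.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(-3, 3, size=n)
    y = np.sin(x) + 0.3 * rng.standard_normal(n)
    return x, y

# A deliberately crude base predictor; its generalization error drives the set size.
x_tr, y_tr = make_data(500)
coef = np.polyfit(x_tr, y_tr, deg=3)

def predict(x):
    return np.polyval(coef, x)

# Calibration: conformity scores are absolute residuals on held-out data.
alpha = 0.1
x_cal, y_cal = make_data(500)
scores = np.abs(y_cal - predict(x_cal))
k = int(np.ceil((len(scores) + 1) * (1 - alpha)))     # finite-sample corrected rank
qhat = np.sort(scores)[k - 1]

# Test: coverage should be at least 1 - alpha; 2*qhat is the (constant) interval size.
x_te, y_te = make_data(2000)
covered = np.abs(y_te - predict(x_te)) <= qhat
print(f"coverage {covered.mean():.3f} (target {1 - alpha}),  average set size {2 * qhat:.3f}")
```

A better-fitting base predictor shrinks the calibration residuals and hence qhat, which is exactly the generalization-to-informativeness link the abstract describes.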
Citations: 0
Data-oriented Coordinated Uplink Transmission for Massive IoT System
Pub Date : 2024-01-22 DOI: arxiv-2401.11761
Jyri Hämäläinen, Rui Dinis, Mehmet C. Ilter
Recently, the paradigm of massive ultra-reliable low-latency IoT communications (URLLC-IoT) has gained growing interest. Reliable delay-critical uplink transmission in IoT is a challenging task, since low-complexity devices typically do not support multiple antennas or demanding signal processing tasks. However, in many IoT services the data volumes are small and deployments may include a massive number of devices. We consider a clustered uplink transmission with two cooperation approaches. First, we focus on a scenario where a location-based channel knowledge map (CKM) is applied to enable cooperation. Second, we consider a scenario where scarce channel side-information is applied in transmission. In both scenarios we also model and analyse the impact of erroneous information. In the performance evaluation we apply the recently introduced data-oriented approach that has gathered significant attention in the context of short-packet transmissions. Specifically, it introduces a transient performance metric for small data transmissions, where the amount of data and available bandwidth play crucial roles. Results show that cooperation between clustered IoT devices may provide notable benefits in terms of increased range. It is noticed that the performance heavily depends on the strength of the static channel component in the CKM-based cooperation. The channel side-information based cooperation is robust against changes in the radio environment but sensitive to possible errors in the channel side-information. Even with large IoT device clusters, side-information errors may set a limit on the use of services that assume high reliability and low latency. Analytic results are verified against simulations, showing only minor differences at low probability levels.
Citations: 0