We consider two types of entropy, namely, the Shannon and Rényi entropies of the Poisson distribution, and establish their properties as functions of the intensity parameter. More precisely, we prove that both entropies increase with the intensity. While for the Shannon entropy the proof is comparatively simple, for the Rényi entropy, which depends on an additional parameter $\alpha>0$, the proof is nontrivial. It is based on an application of Karamata's inequality to the terms of the Poisson distribution.
{"title":"Properties of Shannon and Rényi entropies of the Poisson distribution as the functions of intensity parameter","authors":"Volodymyr Braiman, Anatoliy Malyarenko, Yuliya Mishura, Yevheniia Anastasiia Rudyk","doi":"arxiv-2403.08805","DOIUrl":"https://doi.org/arxiv-2403.08805","url":null,"abstract":"We consider two types of entropy, namely, Shannon and R'{e}nyi entropies of\u0000the Poisson distribution, and establish their properties as the functions of\u0000intensity parameter. More precisely, we prove that both entropies increase with\u0000intensity. While for Shannon entropy the proof is comparatively simple, for\u0000R'{e}nyi entropy, which depends on additional parameter $alpha>0$, we can\u0000characterize it as nontrivial. The proof is based on application of Karamata's\u0000inequality to the terms of Poisson distribution.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140148967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the exponential growth in data volume and the emergence of data-intensive applications, particularly in the field of machine learning, concerns related to resource utilization, privacy, and fairness have become paramount. This paper focuses on textual data and addresses the challenge of encoding sentences into optimized representations through the lens of information theory. In particular, we use empirical estimates of mutual information based on the Donsker-Varadhan representation of the Kullback-Leibler divergence. Our approach leverages this estimation to train an information-theoretic sentence embedding, called TexShape, for (task-based) data compression or for filtering out sensitive information, enhancing privacy and fairness. In this study, we employ a benchmark language model for initial text representation, complemented by neural networks for information-theoretic compression and mutual information estimation. Our experiments demonstrate significant advances in preserving maximal targeted information while retaining minimal sensitive information at adverse compression ratios, as measured by the predictive accuracy of downstream models trained on the compressed data.
{"title":"TexShape: Information Theoretic Sentence Embedding for Language Models","authors":"H. Kaan Kale, Homa Esfahanizadeh, Noel Elias, Oguzhan Baser, Muriel Medard, Sriram Vishwanath","doi":"arxiv-2402.05132","DOIUrl":"https://doi.org/arxiv-2402.05132","url":null,"abstract":"With the exponential growth in data volume and the emergence of\u0000data-intensive applications, particularly in the field of machine learning,\u0000concerns related to resource utilization, privacy, and fairness have become\u0000paramount. This paper focuses on the textual domain of data and addresses\u0000challenges regarding encoding sentences to their optimized representations\u0000through the lens of information-theory. In particular, we use empirical\u0000estimates of mutual information, using the Donsker-Varadhan definition of\u0000Kullback-Leibler divergence. Our approach leverages this estimation to train an\u0000information-theoretic sentence embedding, called TexShape, for (task-based)\u0000data compression or for filtering out sensitive information, enhancing privacy\u0000and fairness. In this study, we employ a benchmark language model for initial\u0000text representation, complemented by neural networks for information-theoretic\u0000compression and mutual information estimations. Our experiments demonstrate\u0000significant advancements in preserving maximal targeted information and minimal\u0000sensitive information over adverse compression ratios, in terms of predictive\u0000accuracy of downstream models that are trained using the compressed data.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139761406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Whenever inspected by humans, reconstructed signals should not be distinguishable from real ones. Typically, such high perceptual quality comes at the price of high reconstruction error, and vice versa. We study this distortion-perception (DP) tradeoff over finite-alphabet channels, for the Wasserstein-$1$ distance induced by a general metric as the perception index, and an arbitrary distortion matrix. Under this setting, we show that computing the DP function and the optimal reconstructions is equivalent to solving a set of linear programming problems. We provide a structural characterization of the DP tradeoff, where the DP function is piecewise linear in the perception index. We further derive a closed-form expression for the case of binary sources.
{"title":"Characterization of the Distortion-Perception Tradeoff for Finite Channels with Arbitrary Metrics","authors":"Dror Freirich, Nir Weinberger, Ron Meir","doi":"arxiv-2402.02265","DOIUrl":"https://doi.org/arxiv-2402.02265","url":null,"abstract":"Whenever inspected by humans, reconstructed signals should not be\u0000distinguished from real ones. Typically, such a high perceptual quality comes\u0000at the price of high reconstruction error, and vice versa. We study this\u0000distortion-perception (DP) tradeoff over finite-alphabet channels, for the\u0000Wasserstein-$1$ distance induced by a general metric as the perception index,\u0000and an arbitrary distortion matrix. Under this setting, we show that computing\u0000the DP function and the optimal reconstructions is equivalent to solving a set\u0000of linear programming problems. We provide a structural characterization of the\u0000DP tradeoff, where the DP function is piecewise linear in the perception index.\u0000We further derive a closed-form expression for the case of binary sources.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139761469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper investigates the information encoded in the embeddings of large language models (LLMs). We conduct simulations to analyze the representation entropy and discover a power law relationship with model sizes. Building upon this observation, we propose a theory based on (conditional) entropy to elucidate the scaling law phenomenon. Furthermore, we delve into the auto-regressive structure of LLMs and examine the relationship between the last token and previous context tokens using information theory and regression techniques. Specifically, we establish a theoretical connection between the information gain of new tokens and ridge regression. Additionally, we explore the effectiveness of Lasso regression in selecting meaningful tokens, which sometimes outperforms the closely related attention weights. Finally, we conduct controlled experiments, and find that information is distributed across tokens, rather than being concentrated in specific "meaningful" tokens alone.
{"title":"The Information of Large Language Model Geometry","authors":"Zhiquan Tan, Chenghai Li, Weiran Huang","doi":"arxiv-2402.03471","DOIUrl":"https://doi.org/arxiv-2402.03471","url":null,"abstract":"This paper investigates the information encoded in the embeddings of large\u0000language models (LLMs). We conduct simulations to analyze the representation\u0000entropy and discover a power law relationship with model sizes. Building upon\u0000this observation, we propose a theory based on (conditional) entropy to\u0000elucidate the scaling law phenomenon. Furthermore, we delve into the\u0000auto-regressive structure of LLMs and examine the relationship between the last\u0000token and previous context tokens using information theory and regression\u0000techniques. Specifically, we establish a theoretical connection between the\u0000information gain of new tokens and ridge regression. Additionally, we explore\u0000the effectiveness of Lasso regression in selecting meaningful tokens, which\u0000sometimes outperforms the closely related attention weights. Finally, we\u0000conduct controlled experiments, and find that information is distributed across\u0000tokens, rather than being concentrated in specific \"meaningful\" tokens alone.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"127 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139761407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pierre Raimbaud (UNIANDES), Jaime Camilo Espitia Castillo (UNIANDES), John Guerra-Gomez (Northeastern University, Silicon Valley Campus)
In a world filled with data, a nation is expected to make decisions informed by data. However, countries first need to collect and publish such data in a way that is meaningful for both citizens and policy makers. A good thematic classification could be instrumental in helping users navigate and find the right resources in a rich data repository such as the one collected by Colombia's National Administrative Department of Statistics (DANE). The Visual Analytics Framework, a methodology for conducting visual analysis developed by T. Munzner [T. Munzner, Visualization Analysis and Design, A K Peters Visualization Series, 1, 2014], could help with this task. This paper presents a case study in which this framework was applied to help the DANE better visualize their data repository and present a more understandable classification of it. It describes the three main analysis tasks identified, the proposed solutions, and the insights generated from them.
{"title":"Extracting and visualizing a new classification system for Colombia's National Administrative Department of Statistics. A visual analytics framework case study","authors":"Pierre RaimbaudUNIANDES, Jaime Camilo Espitia CastilloUNIANDES, John Guerra-GomezNortheastern University, Silicon Valley Campus","doi":"arxiv-2401.15994","DOIUrl":"https://doi.org/arxiv-2401.15994","url":null,"abstract":"In a world filled with data, it is expected for a nation to take decisions\u0000informed by data. However, countries need to first collect and publish such\u0000data in a way meaningful for both citizens and policy makers. A good thematic\u0000classification could be instrumental in helping users navigate and find the\u0000right resources on a rich data repository as the one collected by Colombia's\u0000National Administrative Department of Statistics (DANE). The Visual Analytics\u0000Framework is a methodology for conducting visual analysis developed by T.\u0000Munzner et al. [T. Munzner, Visualization Analysis and Design, A K Peters\u0000Visualization Series, 1, 2014] that could help with this task. This paper\u0000presents a case study applying such framework conducted to help the DANE better\u0000visualize their data repository, and present a more understandable\u0000classification of it. It describes three main analysis tasks identified, the\u0000proposed solutions and the collection of insights generated from them.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"21 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139584743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Algorithmic theories of randomness can be related to theories of probabilistic sequence prediction through the notion of a predictor, defined as a function which supplies lower bounds on initial-segment probabilities of infinite sequences. An infinite binary sequence $z$ is called unpredictable iff its initial-segment "redundancy" $n+\log p(z(n))$ remains sufficiently low relative to every effective predictor $p$. A predictor which maximizes the initial-segment redundancy of a sequence is called optimal for that sequence. It turns out that a sequence is random iff it is unpredictable. More generally, a sequence is random relative to an arbitrary computable distribution iff the distribution is itself an optimal predictor for the sequence. Here "random" can be taken in the sense of Martin-Löf by using weak criteria of effectiveness, or in the sense of Schnorr by using stronger criteria of effectiveness. Under the weaker criteria of effectiveness it is possible to construct a universal predictor which is optimal for all infinite sequences. This predictor assigns nonvanishing limit probabilities precisely to the recursive sequences. Under the stronger criteria of effectiveness it is possible to establish a law of large numbers for sequences random relative to a computable distribution, which may be useful as a criterion of "rationality" for methods of probabilistic prediction. A remarkable feature of effective predictors is the fact that they are expressible in the special form first proposed by Solomonoff. In this form sequence prediction reduces to assigning high probabilities to initial segments with short and/or numerous encodings. This fact provides the link between theories of randomness and Solomonoff's theory of prediction.
{"title":"Predictability and Randomness","authors":"Lenhart K. Schubert","doi":"arxiv-2401.13066","DOIUrl":"https://doi.org/arxiv-2401.13066","url":null,"abstract":"Algorithmic theories of randomness can be related to theories of\u0000probabilistic sequence prediction through the notion of a predictor, defined as\u0000a function which supplies lower bounds on initial-segment probabilities of\u0000infinite sequences. An infinite binary sequence $z$ is called unpredictable iff\u0000its initial-segment \"redundancy\" $n+log p(z(n))$ remains sufficiently low\u0000relative to every effective predictor $p$. A predictor which maximizes the\u0000initial-segment redundancy of a sequence is called optimal for that sequence.\u0000It turns out that a sequence is random iff it is unpredictable. More generally,\u0000a sequence is random relative to an arbitrary computable distribution iff the\u0000distribution is itself an optimal predictor for the sequence. Here \"random\" can\u0000be taken in the sense of Martin-L\"{o}f by using weak criteria of\u0000effectiveness, or in the sense of Schnorr by using stronger criteria of\u0000effectiveness. Under the weaker criteria of effectiveness it is possible to\u0000construct a universal predictor which is optimal for all infinite sequences.\u0000This predictor assigns nonvanishing limit probabilities precisely to the\u0000recursive sequences. Under the stronger criteria of effectiveness it is\u0000possible to establish a law of large numbers for sequences random relative to a\u0000computable distribution, which may be useful as a criterion of \"rationality\"\u0000for methods of probabilistic prediction. A remarkable feature of effective\u0000predictors is the fact that they are expressible in the special form first\u0000proposed by Solomonoff. In this form sequence prediction reduces to assigning\u0000high probabilities to initial segments with short and/or numerous encodings.\u0000This fact provides the link between theories of randomness and Solomonoff's\u0000theory of prediction.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139561948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu
We study the rate-distortion-perception (RDP) tradeoff for a memoryless source model in the asymptotic limit of large block-lengths. Our perception measure is based on a divergence between the distributions of the source and reconstruction sequences conditioned on the encoder output, which was first proposed in [1], [2]. We consider the case when there is no shared randomness between the encoder and the decoder. For the case of discrete memoryless sources we derive a single-letter characterization of the RDP function, thus settling a problem that remains open for the marginal metric introduced in Blau and Michaeli [3] (with no shared randomness). Our achievability scheme is based on lossy source coding with a posterior reference map proposed in [4]. For the case of continuous-valued sources under the squared-error distortion measure and the squared quadratic Wasserstein perception measure, we also derive a single-letter characterization and show that a noise-adding mechanism at the decoder suffices to achieve the optimal representation. For the case of zero perception loss, we show that our characterization interestingly coincides with the results for the marginal metric derived in [5], [6] and again demonstrate that zero perception loss can be achieved with a $3$-dB penalty in the minimum distortion. Finally, we specialize our results to the case of Gaussian sources. We derive the RDP function for vector Gaussian sources and propose a waterfilling-type solution. We also partially characterize the RDP function for a mixture of vector Gaussians.
{"title":"Rate-Distortion-Perception Tradeoff Based on the Conditional-Distribution Perception Measure","authors":"Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu","doi":"arxiv-2401.12207","DOIUrl":"https://doi.org/arxiv-2401.12207","url":null,"abstract":"We study the rate-distortion-perception (RDP) tradeoff for a memoryless\u0000source model in the asymptotic limit of large block-lengths. Our perception\u0000measure is based on a divergence between the distributions of the source and\u0000reconstruction sequences conditioned on the encoder output, which was first\u0000proposed in [1], [2]. We consider the case when there is no shared randomness\u0000between the encoder and the decoder. For the case of discrete memoryless\u0000sources we derive a single-letter characterization of the RDP function, thus\u0000settling a problem that remains open for the marginal metric introduced in Blau\u0000and Michaeli [3] (with no shared randomness). Our achievability scheme is based\u0000on lossy source coding with a posterior reference map proposed in [4]. For the\u0000case of continuous valued sources under squared error distortion measure and\u0000squared quadratic Wasserstein perception measure we also derive a single-letter\u0000characterization and show that a noise-adding mechanism at the decoder suffices\u0000to achieve the optimal representation. For the case of zero perception loss, we\u0000show that our characterization interestingly coincides with the results for the\u0000marginal metric derived in [5], [6] and again demonstrate that zero perception\u0000loss can be achieved with a $3$-dB penalty in the minimum distortion. Finally\u0000we specialize our results to the case of Gaussian sources. We derive the RDP\u0000function for vector Gaussian sources and propose a waterfilling type solution.\u0000We also partially characterize the RDP function for a mixture of vector\u0000Gaussians.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139558511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ioannis Gavras, Italo Atzeni, George C. Alexandropoulos
In this paper, we consider a hybrid Analog and Digital (A/D) receiver architecture with an extremely large Dynamic Metasurface Antenna (DMA) and a $1$-bit resolution Analog-to-Digital Converter (ADC) at each of its reception radio-frequency chains, and present a localization approach for User Equipment (UE) lying in its near-field regime. The proposed algorithm scans the UE area of interest to identify the DMA-based analog combining configuration resulting in the peak of a received pseudo-spectrum, yielding the UE position estimate in three dimensions. Our simulation results demonstrate the validity of the proposed scheme, especially for increasing DMA sizes, and showcase the interplay among various system parameters.
{"title":"Near-Field Localization with $1$-bit Quantized Hybrid A/D Reception","authors":"Ioannis Gavras, Italo Atzeni, George C. Alexandropoulos","doi":"arxiv-2401.12029","DOIUrl":"https://doi.org/arxiv-2401.12029","url":null,"abstract":"In this paper, we consider a hybrid Analog and Digital (A/D) receiver\u0000architecture with an extremely large Dynamic Metasurface Antenna (DMA) and an\u0000$1$-bit resolution Analog-to-Digital Converter (ADC) at each of its reception\u0000radio-frequency chains, and present a localization approach for User Equipment\u0000(UE) lying in its near-field regime. The proposed algorithm scans the UE area\u0000of interest to identify the DMA-based analog combining configuration resulting\u0000to the peak in a received pseudo-spectrum, yielding the UE position estimation\u0000in three dimensions. Our simulation results demonstrate the validity of the\u0000proposed scheme, especially for increasing DMA sizes, and showcase the\u0000interplay among various system parameters.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"157 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139558799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Matteo Zecchin, Sangwoo Park, Osvaldo Simeone, Fredrik Hellström
The safe integration of machine learning modules in decision-making processes hinges on their ability to quantify uncertainty. A popular technique to achieve this goal is conformal prediction (CP), which transforms an arbitrary base predictor into a set predictor with coverage guarantees. While CP certifies the predicted set to contain the target quantity with a user-defined tolerance, it does not provide control over the average size of the predicted sets, i.e., over the informativeness of the prediction. In this work, a theoretical connection is established between the generalization properties of the base predictor and the informativeness of the resulting CP prediction sets. To this end, an upper bound is derived on the expected size of the CP set predictor that builds on generalization error bounds for the base predictor. The derived upper bound provides insights into the dependence of the average size of the CP set predictor on the amount of calibration data, the target reliability, and the generalization performance of the base predictor. The theoretical insights are validated using simple numerical regression and classification tasks.
{"title":"Generalization and Informativeness of Conformal Prediction","authors":"Matteo Zecchin, Sangwoo Park, Osvaldo Simeone, Fredrik Hellström","doi":"arxiv-2401.11810","DOIUrl":"https://doi.org/arxiv-2401.11810","url":null,"abstract":"The safe integration of machine learning modules in decision-making processes\u0000hinges on their ability to quantify uncertainty. A popular technique to achieve\u0000this goal is conformal prediction (CP), which transforms an arbitrary base\u0000predictor into a set predictor with coverage guarantees. While CP certifies the\u0000predicted set to contain the target quantity with a user-defined tolerance, it\u0000does not provide control over the average size of the predicted sets, i.e.,\u0000over the informativeness of the prediction. In this work, a theoretical\u0000connection is established between the generalization properties of the base\u0000predictor and the informativeness of the resulting CP prediction sets. To this\u0000end, an upper bound is derived on the expected size of the CP set predictor\u0000that builds on generalization error bounds for the base predictor. The derived\u0000upper bound provides insights into the dependence of the average size of the CP\u0000set predictor on the amount of calibration data, the target reliability, and\u0000the generalization performance of the base predictor. The theoretical insights\u0000are validated using simple numerical regression and classification tasks.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"146 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139559339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, the paradigm of massive ultra-reliable low-latency IoT communications (URLLC-IoT) has gained growing interest. Reliable delay-critical uplink transmission in IoT is a challenging task, since low-complexity devices typically do not support multiple antennas or demanding signal processing tasks. However, in many IoT services the data volumes are small, and deployments may include a massive number of devices. We consider a clustered uplink transmission with two cooperation approaches. First, we focus on a scenario where a location-based channel knowledge map (CKM) is applied to enable cooperation; second, we consider a scenario where scarce channel side-information is applied in transmission. In both scenarios we also model and analyse the impact of erroneous information. In the performance evaluation we apply the recently introduced data-oriented approach, which has gathered significant attention in the context of short-packet transmissions. Specifically, it introduces a transient performance metric for small data transmissions, where the amount of data and the available bandwidth play crucial roles. Results show that cooperation between clustered IoT devices may provide notable benefits in terms of increased range. The performance depends heavily on the strength of the static channel component in the CKM-based cooperation. The channel side-information based cooperation is robust against changes in the radio environment but sensitive to possible errors in the channel side-information. Even with large IoT device clusters, side-information errors may limit the use of services requiring high reliability and low latency. Analytic results are verified against simulations, showing only minor differences at low probability levels.
{"title":"Data-oriented Coordinated Uplink Transmission for Massive IoT System","authors":"Jyri Hämäläinen, Rui Dinis, Mehmet C. Ilter","doi":"arxiv-2401.11761","DOIUrl":"https://doi.org/arxiv-2401.11761","url":null,"abstract":"Recently, the paradigm of massive ultra-reliable low-latency IoT\u0000communications (URLLC-IoT) has gained growing interest. Reliable delay-critical\u0000uplink transmission in IoT is a challenging task since low-complex devices\u0000typically do not support multiple antennas or demanding signal processing\u0000tasks. However, in many IoT services the data volumes are small and deployments\u0000may include massive number of devices. We consider on a clustered uplink\u0000transmission with two cooperation approaches: First, we focus on scenario where\u0000location-based channel knowledge map (CKM) is applied to enable cooperation.\u0000Second, we consider a scenario where scarce channel side-information is applied\u0000in transmission. In both scenarios we also model and analyse the impact of\u0000erroneous information. In the performance evaluation we apply the recently\u0000introduced data-oriented approach that has gathered significant attention in\u0000the context of short-packet transmissions. Specifically, it introduces a\u0000transient performance metric for small data transmissions, where the amount of\u0000data and available bandwidth play crucial roles. Results show that cooperation\u0000between clustered IoT devices may provide notable benefits in terms of\u0000increased range. It is noticed that the performance is heavily depending on the\u0000strength of the static channel component in the CKM based cooperation. The\u0000channel side-information based cooperation is robust against changes in the\u0000radio environment but sensitive to possible errors in the channel\u0000side-information. Even with large IoT device clusters, side-information errors\u0000may set a limit for the use of services assuming high-reliability and\u0000low-latency. Analytic results are verified against simulations, showing only\u0000minor differences at low probability levels.","PeriodicalId":501433,"journal":{"name":"arXiv - CS - Information Theory","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139558663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}