Latest Publications: IEEE Journal on Selected Areas in Information Theory

Learning Algorithm Generalization Error Bounds via Auxiliary Distributions
Pub Date : 2024-04-25 DOI: 10.1109/JSAIT.2024.3391900
Gholamali Aminian;Saeed Masiha;Laura Toni;Miguel R. D. Rodrigues
Generalization error bounds are essential for understanding how well machine learning models generalize. In this work, we propose a novel method, the Auxiliary Distribution Method, that leads to new upper bounds on expected generalization errors appropriate for supervised learning scenarios. We show that our general upper bounds can be specialized, under some conditions, to new bounds involving the $\alpha$-Jensen-Shannon and $\alpha$-Rényi $(0 < \alpha < 1)$ information between a random variable modeling the set of training samples and another random variable modeling the set of hypotheses. Our upper bounds based on $\alpha$-Jensen-Shannon information are also finite. Additionally, we demonstrate how our auxiliary distribution method can be used to derive upper bounds on the excess risk of some learning algorithms in the supervised learning context, and on the generalization error under distribution mismatch, where the mismatch is modeled as the $\alpha$-Jensen-Shannon or $\alpha$-Rényi divergence between the distributions of the test and training data samples. We also outline the conditions under which our proposed upper bounds might be tighter than earlier upper bounds.
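For readers unfamiliar with the divergences involved, the following sketch spells out one standard convention for the $\alpha$-Jensen-Shannon information between the training set $S$ and the hypothesis $W$; the notation and normalization are assumptions on our part, and the paper's exact definitions may differ.

```latex
% alpha-Jensen-Shannon information between training sample S and hypothesis W,
% assuming the standard alpha-skewed definition (not the paper's verbatim one):
I_{\mathrm{JS}}^{\alpha}(S;W) = D_{\mathrm{JS}}^{\alpha}\big(P_{S,W} \,\|\, P_S \otimes P_W\big),
\qquad
D_{\mathrm{JS}}^{\alpha}(P\|Q) = \alpha\, D_{\mathrm{KL}}\big(P \,\|\, \alpha P + (1-\alpha)Q\big)
  + (1-\alpha)\, D_{\mathrm{KL}}\big(Q \,\|\, \alpha P + (1-\alpha)Q\big).
```

Because the skewed mixture $\alpha P + (1-\alpha)Q$ dominates both arguments for $0 < \alpha < 1$, each KL term is finite, which is consistent with the finiteness of the $\alpha$-Jensen-Shannon bounds claimed above.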
Citations: 0
Neural Distributed Compressor Discovers Binning
Pub Date : 2024-04-24 DOI: 10.1109/JSAIT.2024.3393429
Ezgi Ozyilkan;Johannes Ballé;Elza Erkip
We consider lossy compression of an information source when the decoder has lossless access to a correlated one. This setup, also known as the Wyner-Ziv problem, is a special case of distributed source coding. To this day, practical approaches for the Wyner-Ziv problem have neither been fully developed nor heavily investigated. We propose a data-driven method based on machine learning that leverages the universal function approximation capability of artificial neural networks. We find that our neural network-based compression scheme, based on variational vector quantization, recovers some principles of the optimum theoretical solution of the Wyner-Ziv setup, such as binning in the source space as well as optimal combination of the quantization index and side information, for exemplary sources. These behaviors emerge although no structure exploiting knowledge of the source distributions was imposed. Binning is a widely used tool in information theoretic proofs and methods, and to our knowledge, this is the first time it has been explicitly observed to emerge from data-driven learning.
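To make the emergent binning concrete, here is a minimal hand-crafted Wyner-Ziv-style toy in Python; it is not the paper's learned compressor, and the quantizer step, number of bins, and noise level are illustrative assumptions.

```python
# Hand-crafted Wyner-Ziv binning toy (illustrative; not the learned scheme):
# the encoder transmits only a bin index, and the decoder resolves the
# remaining ambiguity using correlated side information.
import numpy as np

rng = np.random.default_rng(0)
n, step, num_bins = 10_000, 1.0, 4        # quantizer step and bins per cycle

x = rng.uniform(0, 16, size=n)            # source seen by the encoder
y = x + rng.normal(0, 0.3, size=n)        # side information at the decoder

q = np.round(x / step)                    # full quantization index
bin_idx = (q % num_bins).astype(int)      # encoder sends only this (2 bits)

# Decoder: among all indices consistent with the bin, pick the one whose
# reconstruction lies closest to the side information y.
k = np.round((y / step - bin_idx) / num_bins)
x_hat = (k * num_bins + bin_idx) * step

print("rate:", np.log2(num_bins), "bits/sample")
print("MSE :", np.mean((x - x_hat) ** 2))
```

Because the bins are spaced four quantizer steps apart while the side-information noise is much smaller, the decoder almost always recovers the correct index, so the scheme pays only 2 bits instead of the roughly 4 bits a standalone quantizer of the same resolution would need.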
Citations: 0
Training Generative Models From Privatized Data via Entropic Optimal Transport
Pub Date : 2024-04-16 DOI: 10.1109/JSAIT.2024.3387463
Daria Reshetova;Wei-Ning Chen;Ayfer Özgür
Local differential privacy is a powerful method for privacy-preserving data collection. In this paper, we develop a framework for training Generative Adversarial Networks (GANs) on differentially privatized data. We show that entropic regularization of optimal transport – a popular regularization method in the literature that has often been leveraged for its computational benefits – enables the generator to learn the raw (unprivatized) data distribution even though it only has access to privatized samples. We prove that at the same time this leads to fast statistical convergence at the parametric rate. This shows that entropic regularization of optimal transport uniquely enables the mitigation of both the effects of privatization noise and the curse of dimensionality in statistical convergence. We provide experimental evidence to support the efficacy of our framework in practice.
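As a reference point for the entropic regularization being leveraged, the sketch below computes entropically regularized OT between two one-dimensional samples with plain Sinkhorn iterations; all names and constants are ours, and the paper's GAN training loop is not reproduced.

```python
# Minimal Sinkhorn sketch of entropic optimal transport between two samples
# (a standalone illustration of the regularizer, not the paper's GAN setup).
import numpy as np

def sinkhorn_cost(x, y, eps=0.05, iters=300):
    """Entropic OT cost between two 1-D samples with uniform weights."""
    C = (x[:, None] - y[None, :]) ** 2
    C = C / C.max()                      # normalize to keep the kernel stable
    K = np.exp(-C / eps)                 # Gibbs kernel from the entropic term
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    v = np.ones(len(y))
    for _ in range(iters):               # Sinkhorn fixed-point iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]      # entropic transport plan
    return float(np.sum(P * C))

rng = np.random.default_rng(1)
real = rng.normal(0.0, 1.0, 200)         # stand-in for raw data
fake = rng.normal(0.5, 1.0, 200)         # stand-in for generator output
print("entropic OT cost:", sinkhorn_cost(real, fake))
```

In a GAN setting, a cost of this form would be minimized over the generator's parameters; the computational appeal mentioned above comes from each Sinkhorn iteration being just two matrix-vector products.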
Citations: 0
Differentially Private Stochastic Linear Bandits: (Almost) for Free
Pub Date : 2024-04-16 DOI: 10.1109/JSAIT.2024.3389954
Osama Hanna;Antonious M. Girgis;Christina Fragouli;Suhas Diggavi
In this paper, we propose differentially private algorithms for the problem of stochastic linear bandits in the central, local, and shuffled models. In the central model, we achieve almost the same regret as the optimal non-private algorithms, which means we get privacy for free. In particular, we achieve a regret of $\tilde{O}\left(\sqrt{T}+\frac{1}{\varepsilon}\right)$ matching the known lower bound for private linear bandits, while the best previously known algorithm achieves $\tilde{O}\left(\frac{1}{\varepsilon}\sqrt{T}\right)$. In the local case, we achieve a regret of $\tilde{O}\left(\frac{1}{\varepsilon}\sqrt{T}\right)$, which matches the non-private regret for constant $\varepsilon$ but suffers a regret penalty when $\varepsilon$ is small. In the shuffled model, we also achieve a regret of $\tilde{O}\left(\sqrt{T}+\frac{1}{\varepsilon}\right)$, while the best previously known algorithm suffers a regret of $\tilde{O}\left(\frac{1}{\varepsilon}T^{3/5}\right)$. Our numerical evaluation validates our theoretical results. Our results generalize to contextual linear bandits with known context distributions.
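The sketch below is not the paper's algorithm; it only illustrates the generic recipe of privatizing a linear bandit's regression statistics with Gaussian noise before forming the estimate. The noise scale `sigma_dp` is a placeholder and is not calibrated to any particular $(\varepsilon, \delta)$.

```python
# Naive privatized linear bandit sketch (illustrative baseline only):
# Gaussian noise is added to the Gram matrix and response vector each round.
import numpy as np

rng = np.random.default_rng(2)
d, T, sigma_dp = 5, 2000, 1.0            # sigma_dp: uncalibrated DP noise scale
theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)           # unknown parameter, unit norm

V, b, regret = np.eye(d), np.zeros(d), 0.0
for t in range(T):
    arms = rng.normal(size=(10, d))
    arms /= np.linalg.norm(arms, axis=1, keepdims=True)
    N = rng.normal(0, sigma_dp, (d, d))
    V_priv = V + (N + N.T) / 2           # symmetric noise on the Gram matrix
    b_priv = b + rng.normal(0, sigma_dp, d)
    theta_hat = np.linalg.solve(V_priv + np.eye(d), b_priv)
    a = arms[np.argmax(arms @ theta_hat)]     # greedy on the private estimate
    r = a @ theta + rng.normal(0, 0.1)
    V += np.outer(a, a)
    b += r * a
    regret += np.max(arms @ theta) - a @ theta
print("cumulative regret:", regret)
```

The paper's point is that, in the central model, a more careful construction makes the privacy overhead an additive $\tilde{O}(1/\varepsilon)$ term rather than the multiplicative $1/\varepsilon$ factor a naive scheme like this one tends to incur.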
Citations: 0
Improving Group Testing via Gradient Descent
Pub Date : 2024-04-15 DOI: 10.1109/JSAIT.2024.3386182
Sundara Rajan Srinivasavaradhan;Pavlos Nikolopoulos;Christina Fragouli;Suhas Diggavi
We study the problem of group testing with non-identical, independent priors. So far, the pooling strategies that have been proposed in the literature take the following approach: a hand-crafted test design along with a decoding strategy is proposed, and guarantees are provided on how many tests are sufficient in order to identify all infections in a population. In this paper, we take a different, yet perhaps more practical, approach: we fix the decoder and the number of tests, and we ask, given these, what is the best test design one could use? We explore this question for the Definite Non-Defectives (DND) decoder. We formulate a (non-convex) optimization problem, where the objective function is the expected number of errors for a particular design. We find approximate solutions via gradient descent, which we further optimize with informed initialization. We illustrate through simulations that our method can achieve significant performance improvement over traditional approaches.
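As a sketch of this recipe under our own assumptions (an independence approximation across items and tests, and illustrative constants), one can relax the binary design to inclusion probabilities, write the expected number of DND false positives in differentiable closed form, and minimize it by gradient descent:

```python
# Hedged sketch: gradient-descent optimization of a relaxed test design for
# the DND decoder. The objective below is our independence-approximation
# proxy for the expected number of false positives, not the paper's exact one.
import torch

torch.manual_seed(0)
n_items, n_tests = 40, 15
p = torch.rand(n_items) * 0.1             # non-identical infection priors

logits = torch.zeros(n_tests, n_items, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.05)

for step in range(500):
    q = torch.sigmoid(logits)             # relaxed design: q[t, i] in (0, 1)
    log_miss = torch.log1p(-q * p)        # log P(item j adds no positive to test t)
    excl = log_miss.sum(dim=1, keepdim=True) - log_miss
    pos_wo_i = 1.0 - torch.exp(excl)      # P(test t positive, excluding item i)
    # Per-test probability that test t gives no "negative evidence" for item i:
    stay = 1.0 - q * (1.0 - pos_wo_i)
    fp = (1.0 - p) * torch.exp(torch.log(stay.clamp_min(1e-9)).sum(dim=0))
    loss = fp.sum()                       # expected number of DND errors
    opt.zero_grad()
    loss.backward()
    opt.step()

print("expected DND false positives:", loss.item())
```

Under the DND rule, defectives are never missed, so only false positives contribute to the expected-error objective; that is why the proxy above sums only over healthy items.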
Citations: 0
Optimal Binary Differential Privacy via Graphs
Pub Date : 2024-04-11 DOI: 10.1109/JSAIT.2024.3384183
Sahel Torkamani;Javad B. Ebrahimi;Parastoo Sadeghi;Rafael G. L. D’Oliveira;Muriel Médard
We present the notion of reasonable utility for binary mechanisms, which applies to all utility functions in the literature. This notion induces a partial ordering on the performance of all binary differentially private (DP) mechanisms. DP mechanisms that are maximal elements of this ordering are optimal DP mechanisms for every reasonable utility. By viewing differential privacy as a randomized graph coloring, we characterize these optimal DP mechanisms in terms of their behavior on a certain subset of the boundary datasets, which we call a boundary hitting set. In the process of establishing our results, we also introduce a useful notion that generalizes DP conditions for binary-valued queries, which we coin suitable pairs. Suitable pairs abstract away the algebraic roles of $\varepsilon, \delta$ in the DP framework, making the derivations and understanding of our proofs simpler. Additionally, the notion of a suitable pair can potentially capture privacy conditions in frameworks other than DP and may be of independent interest.
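For context on binary mechanisms, the snippet below is textbook randomized response, the simplest binary DP mechanism, together with the usual debiasing step; it is a baseline for intuition, not one of the optimal mechanisms characterized in the paper.

```python
# Randomized response: the canonical binary (eps, 0)-DP mechanism.
import numpy as np

def randomized_response(bit, eps, rng):
    """Report the true bit with probability e^eps / (1 + e^eps); flip otherwise."""
    keep = np.exp(eps) / (1.0 + np.exp(eps))
    return bit if rng.random() < keep else 1 - bit

rng = np.random.default_rng(3)
eps = 1.0
true_bits = rng.integers(0, 2, 10_000)
noisy = np.array([randomized_response(b, eps, rng) for b in true_bits])

# Debias: E[noisy] = (2*keep - 1) * mean + (1 - keep), so invert the affine map.
keep = np.exp(eps) / (1.0 + np.exp(eps))
est = (noisy.mean() - (1.0 - keep)) / (2.0 * keep - 1.0)
print("true mean:", true_bits.mean(), " debiased private estimate:", est)
```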
Citations: 0
Iterative Sketching for Secure Coded Regression
Pub Date : 2024-04-04 DOI: 10.1109/JSAIT.2024.3384395
Neophytos Charalambides;Hessam Mahdavifar;Mert Pilanci;Alfred O. Hero
Linear regression is a fundamental and primitive problem in supervised machine learning, with applications ranging from epidemiology to finance. In this work, we propose methods for speeding up distributed linear regression. We do so by leveraging randomized techniques, while also ensuring security and straggler resiliency in asynchronous distributed computing systems. Specifically, we randomly rotate the basis of the system of equations and then subsample blocks, to simultaneously secure the information and reduce the dimension of the regression problem. In our setup, the basis rotation corresponds to an encoded encryption in an approximate gradient coding scheme, and the subsampling corresponds to the responses of the non-straggling servers in the centralized coded computing framework. This results in a distributive iterative stochastic approach for matrix compression and steepest descent.
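The two ingredients named above, a random rotation of the basis and block subsampling, can be illustrated on a single machine. This sketch uses a signed Hadamard rotation and a fresh random row subsample per steepest-descent step; it does not reproduce the paper's coded, secure multi-server protocol, and all dimensions are illustrative.

```python
# Single-machine sketch of iterative sketching for least squares:
# rotate with a randomized Hadamard transform, subsample rows, descend.
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(4)
n, d = 1024, 20                           # n must be a power of two for hadamard
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=n)

D = rng.choice([-1.0, 1.0], size=n)       # random sign flips
H = hadamard(n) / np.sqrt(n)              # orthonormal Hadamard matrix
A_rot, b_rot = H @ (D[:, None] * A), H @ (D * b)   # rotated system

x = np.zeros(d)
for it in range(50):
    rows = rng.choice(n, size=128, replace=False)  # fresh sketch per iteration
    As, bs = A_rot[rows], b_rot[rows]
    g = As.T @ (As @ x - bs)                       # gradient on the sketch
    step = (g @ g) / (g @ (As.T @ (As @ g)) + 1e-12)  # exact line search
    x -= step * g

print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```

The rotation flattens the leverage scores of the rows, so uniform subsampling behaves like a well-conditioned sketch; in the distributed setting described above, each subsample plays the role of the responses from the non-straggling servers.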
Citations: 0
Total Variation Meets Differential Privacy
Pub Date : 2024-04-04 DOI: 10.1109/JSAIT.2024.3384083
Elena Ghazi;Ibrahim Issa
The framework of approximate differential privacy is considered and augmented by leveraging the notion of “the total variation of a (privacy-preserving) mechanism” (denoted by $\eta$-TV). With this refinement, an exact composition result is derived and shown to be significantly tighter than the optimal bounds for differential privacy (which do not consider the total variation). Furthermore, it is shown that $(\varepsilon, \delta)$-DP with $\eta$-TV is closed under subsampling. The induced total variation of commonly used mechanisms is computed. Moreover, the notion of total variation of a mechanism is studied in the local privacy setting, and privacy-utility tradeoffs are investigated. In particular, total variation distance and KL divergence are considered as utility functions and studied through the lens of contraction coefficients. Finally, the results are compared and connected to the locally differentially private setting.
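For reference, the two constraints being combined can be written in one standard convention (the paper's exact conventions may differ):

```latex
% (varepsilon, delta)-DP together with an eta-TV constraint on mechanism M,
% over all neighboring datasets D ~ D' and measurable events S:
\Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta,
\qquad
\sup_{D \sim D'} d_{\mathrm{TV}}\big(\mathcal{M}(D), \mathcal{M}(D')\big) \le \eta .
```

Since $d_{\mathrm{TV}}$ bounds the probability mass on which the two output distributions can disagree, tracking $\eta$ alongside $(\varepsilon, \delta)$ gives composition strictly more information to work with than $(\varepsilon, \delta)$ alone.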
Citations: 0
The Worst-Case Data-Generating Probability Measure in Statistical Learning
Pub Date : 2024-04-02 DOI: 10.1109/JSAIT.2024.3383281
Xinying Zou;Samir M. Perlaza;Iñaki Esnaola;Eitan Altman;H. Vincent Poor
The worst-case data-generating (WCDG) probability measure is introduced as a tool for characterizing the generalization capabilities of machine learning algorithms. Such a WCDG probability measure is shown to be the unique solution to two different optimization problems: $(a)$ The maximization of the expected loss over the set of probability measures on the datasets whose relative entropy with respect to a reference measure is not larger than a given threshold; and $(b)$ The maximization of the expected loss with regularization by relative entropy with respect to the reference measure. Such a reference measure can be interpreted as a prior on the datasets. The WCDG cumulants are finite and bounded in terms of the cumulants of the reference measure. To analyze the concentration of the expected empirical risk induced by the WCDG probability measure, the notion of $(\epsilon, \delta)$-robustness of models is introduced. Closed-form expressions are presented for the sensitivity of the expected loss for a fixed model. These results lead to a novel expression for the generalization error of arbitrary machine learning algorithms. This exact expression is provided in terms of the WCDG probability measure and leads to an upper bound that is equal to the sum of the mutual information and the lautum information between the models and the datasets, up to a constant factor. This upper bound is achieved by a Gibbs algorithm. This finding reveals that an exploration into the generalization error of the Gibbs algorithm facilitates the derivation of overarching insights applicable to any machine learning algorithm.
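Problem $(b)$ has the familiar Gibbs-form solution; the sketch below uses assumed notation, with $\lambda > 0$ the regularization parameter, $Q$ the reference measure, and $\ell(\theta, z)$ the loss of model $\theta$ on data point $z$:

```latex
P^{\star} = \arg\max_{P}\; \mathbb{E}_{P}\big[\ell(\theta, Z)\big] - \lambda\, D\big(P \,\|\, Q\big),
\qquad
\frac{dP^{\star}}{dQ}(z) = \frac{\exp\big(\ell(\theta, z)/\lambda\big)}{\mathbb{E}_{Q}\big[\exp\big(\ell(\theta, Z)/\lambda\big)\big]}.
```

The normalizing expectation is the moment-generating function of the loss under $Q$, which is consistent with the statement above that the WCDG cumulants are controlled by the cumulants of the reference measure.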
Citations: 0
On the Computation of the Gaussian Rate–Distortion–Perception Function
Pub Date : 2024-03-29 DOI: 10.1109/JSAIT.2024.3381230
Giuseppe Serra;Photios A. Stavrou;Marios Kountouris
In this paper, we study the computation of the rate-distortion-perception function (RDPF) for a multivariate Gaussian source assuming jointly Gaussian reconstruction under mean squared error (MSE) distortion and, respectively, Kullback–Leibler divergence, geometric Jensen-Shannon divergence, squared Hellinger distance, and squared Wasserstein-2 distance perception metrics. To this end, we first characterize the analytical bounds of the scalar Gaussian RDPF for the aforementioned divergence functions, also providing the RDPF-achieving forward “test-channel” realization. Focusing on the multivariate case, assuming jointly Gaussian reconstruction and tensorizable distortion and perception metrics, we establish that the optimal solution resides on the vector space spanned by the eigenvector of the source covariance matrix. Consequently, the multivariate optimization problem can be expressed as a function of the scalar Gaussian RDPFs of the source marginals, constrained by global distortion and perception levels. Leveraging this characterization, we design an alternating minimization scheme based on the block nonlinear Gauss–Seidel method, which optimally solves the problem while identifying the Gaussian RDPF-achieving realization. Furthermore, the associated algorithmic embodiment is provided, as well as the convergence and the rate of convergence characterization. Lastly, for the “perfect realism” regime, the analytical solution for the multivariate Gaussian RDPF is obtained. We corroborate our results with numerical simulations and draw connections to existing results.
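For orientation, when the perception constraint is inactive, the scalar Gaussian RDPF should collapse to the classical Gaussian rate-distortion function under MSE distortion (our simplification, not a statement taken from the paper):

```latex
R(D) = \max\left\{0,\; \tfrac{1}{2}\log\frac{\sigma^{2}}{D}\right\},
```

where $\sigma^{2}$ is the source variance; adding a perception constraint can only increase the rate required at a given distortion level, since it shrinks the feasible set of the underlying minimization.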
Citations: 0