
IEEE Journal on Selected Areas in Information Theory: Latest Publications

Toward General Function Approximation in Nonstationary Reinforcement Learning
Pub Date: 2024-03-29 | DOI: 10.1109/JSAIT.2024.3381818
Songtao Feng;Ming Yin;Ruiquan Huang;Yu-Xiang Wang;Jing Yang;Yingbin Liang
Function approximation has experienced significant success in the field of reinforcement learning (RL). Despite some progress in developing theory for nonstationary RL with function approximation under structural assumptions, existing work on nonstationary RL with general function approximation remains limited. In this work, we investigate two different approaches to nonstationary RL with general function approximation: a confidence-set based algorithm and a UCB-type algorithm. For the first approach, we introduce a new complexity measure for nonstationary MDPs, the dynamic Bellman Eluder (DBE), and then propose a confidence-set based algorithm, SW-OPEA, built on this complexity metric. SW-OPEA features a sliding-window mechanism and a novel confidence-set design for nonstationary MDPs. For the second approach, we propose a UCB-type algorithm, LSVI-Nonstationary, following the popular least-squares value iteration (LSVI) framework, which mitigates the computational-efficiency challenge of the confidence-set based approach. LSVI-Nonstationary features a restart mechanism and a new design of the bonus term to handle nonstationarity. The two proposed algorithms outperform existing algorithms for nonstationary linear and tabular MDPs in the small variation-budget setting. To the best of our knowledge, these are the first confidence-set based and UCB-type algorithms in the context of nonstationary MDPs.
{"title":"Toward General Function Approximation in Nonstationary Reinforcement Learning","authors":"Songtao Feng;Ming Yin;Ruiquan Huang;Yu-Xiang Wang;Jing Yang;Yingbin Liang","doi":"10.1109/JSAIT.2024.3381818","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3381818","url":null,"abstract":"Function approximation has experienced significant success in the field of reinforcement learning (RL). Despite a handful of progress on developing theory for nonstationary RL with function approximation under structural assumptions, existing work for nonstationary RL with general function approximation is still limited. In this work, we investigate two different approaches for nonstationary RL with general function approximation: confidence-set based algorithm and UCB-type algorithm. For the first approach, we introduce a new complexity measure called dynamic Bellman Eluder (DBE) for nonstationary MDPs, and then propose a confidence-set based algorithm SW-OPEA based on the complexity metric. SW-OPEA features the sliding window mechanism and a novel confidence set design for nonstationary MDPs. For the second approach, we propose a UCB-type algorithm LSVI-Nonstationary following the popular least-square-value-iteration (LSVI) framework, and mitigate the computational efficiency challenge of the confidence-set based approach. LSVI-Nonstationary features the restart mechanism and a new design of the bonus term to handle nonstationarity. The two proposed algorithms outperform the existing algorithms for nonstationary linear and tabular MDPs in the small variation budget setting. To the best of our knowledge, the two approaches are the first confidence-set based algorithm and UCB-type algorithm in the context of nonstationary MDPs.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"190-206"},"PeriodicalIF":0.0,"publicationDate":"2024-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140820417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
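
Neither algorithm is reproduced in this listing, but the intuition behind SW-OPEA's sliding window (and, similarly, LSVI-Nonstationary's restarts) can be illustrated on a toy drifting-mean problem: discarding stale data tracks a nonstationary target better than using the full history. The drift model, window size, and all other choices below are our own, not the paper's.

```python
import numpy as np

# Toy nonstationary estimation: the mean flips sign halfway through.
rng = np.random.default_rng(0)
T, W = 2000, 200                                    # horizon, window size (ours)
true_mean = np.where(np.arange(T) < T // 2, 1.0, -1.0)
rewards = true_mean + rng.normal(scale=0.5, size=T)

full_err, window_err = [], []
for t in range(W, T):
    full_est = rewards[: t + 1].mean()              # uses the entire history
    window_est = rewards[t + 1 - W : t + 1].mean()  # sliding-window estimate
    full_err.append(abs(full_est - true_mean[t]))
    window_err.append(abs(window_est - true_mean[t]))

print(f"mean |error| with full history : {np.mean(full_err):.3f}")
print(f"mean |error| with window W={W} : {np.mean(window_err):.3f}")
```
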
Exactly Optimal and Communication-Efficient Private Estimation via Block Designs
Pub Date: 2024-03-27 | DOI: 10.1109/JSAIT.2024.3381195
Hyun-Young Park;Seung-Hyun Nam;Si-Hyeon Lee
In this paper, we propose a new class of local differential privacy (LDP) schemes based on combinatorial block designs for discrete distribution estimation. This class not only recovers many known LDP schemes within a unified framework of combinatorial block design, but also suggests a novel way of finding new schemes that achieve the exactly optimal (or near-optimal) privacy-utility trade-off with lower communication costs. Indeed, for a certain set of input data sizes and LDP constraints, we find many new LDP schemes that achieve the exactly optimal privacy-utility trade-off with the minimum communication cost among all unbiased or consistent schemes. Furthermore, to partially resolve the sparse-existence issue of block-design schemes, we consider a broader class of LDP schemes based on regular and pairwise-balanced designs, called RPBD schemes, which relax one of the symmetry requirements on block designs. By considering this broader class of RPBD schemes, we can find LDP schemes that achieve a near-optimal privacy-utility trade-off with reasonably low communication costs for a much larger set of input data sizes and LDP constraints.
{"title":"Exactly Optimal and Communication-Efficient Private Estimation via Block Designs","authors":"Hyun-Young Park;Seung-Hyun Nam;Si-Hyeon Lee","doi":"10.1109/JSAIT.2024.3381195","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3381195","url":null,"abstract":"In this paper, we propose a new class of local differential privacy (LDP) schemes based on combinatorial block designs for discrete distribution estimation. This class not only recovers many known LDP schemes in a unified framework of combinatorial block design, but also suggests a novel way of finding new schemes achieving the exactly optimal (or near-optimal) privacy-utility trade-off with lower communication costs. Indeed, we find many new LDP schemes that achieve the exactly optimal privacy-utility trade-off, with the minimum communication cost among all the unbiased or consistent schemes, for a certain set of input data size and LDP constraint. Furthermore, to partially solve the sparse existence issue of block design schemes, we consider a broader class of LDP schemes based on regular and pairwise-balanced designs, called RPBD schemes, which relax one of the symmetry requirements on block designs. By considering this broader class of RPBD schemes, we can find LDP schemes achieving near-optimal privacy-utility trade-off with reasonably low communication costs for a much larger set of input data size and LDP constraint.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"123-134"},"PeriodicalIF":0.0,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140621268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
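
The block-design constructions themselves are not reproduced here. As a reference point, the snippet below implements k-ary randomized response, a classical LDP scheme of the kind the block-design framework unifies, together with its standard unbiased frequency estimator; the parameters (k, eps, n) are illustrative choices of ours.

```python
import numpy as np

def krr_perturb(x, k, eps, rng):
    # report the truth w.p. e^eps / (e^eps + k - 1), else a uniform other symbol
    if rng.random() < np.exp(eps) / (np.exp(eps) + k - 1):
        return x
    other = rng.integers(k - 1)
    return other if other < x else other + 1

def krr_estimate(reports, k, eps):
    # standard unbiased estimator inverting the randomized-response channel
    p = np.exp(eps) / (np.exp(eps) + k - 1)
    q = 1.0 / (np.exp(eps) + k - 1)
    freq = np.bincount(reports, minlength=k) / len(reports)
    return (freq - q) / (p - q)

rng = np.random.default_rng(1)
k, eps, n = 5, 1.0, 50_000
truth = np.array([0.40, 0.30, 0.15, 0.10, 0.05])
data = rng.choice(k, size=n, p=truth)
reports = np.array([krr_perturb(x, k, eps, rng) for x in data])
print(np.round(krr_estimate(reports, k, eps), 3), "vs truth", truth)
```
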
Learning Robust to Distributional Uncertainties and Adversarial Data
Pub Date: 2024-03-26 | DOI: 10.1109/JSAIT.2024.3381869
Alireza Sadeghi;Gang Wang;Georgios B. Giannakis
Successful training of data-intensive deep neural networks relies critically on vast, clean, high-quality datasets. In practice, however, model reliability diminishes, particularly with noisy, outlier-corrupted data samples encountered at test time. This challenge intensifies when dealing with anonymized, heterogeneous data sets stored across geographically distinct locations due to, e.g., privacy concerns. This paper introduces robust learning frameworks tailored to centralized and federated learning scenarios. Our goal is to fortify model resilience with a focus on (i) addressing distribution shifts from training to inference time; and (ii) ensuring test-time robustness when a trained model may encounter outliers or adversarially contaminated test data samples. To this end, we start with a centralized setting where the true data distribution is unknown but resides within a Wasserstein ball centered at the empirical distribution. We obtain robust models by minimizing the worst-case expected loss within this ball, which yields an intractable infinite-dimensional optimization problem. Leveraging strong duality, we arrive at a tractable surrogate learning problem. We develop two stochastic primal-dual algorithms to solve the resultant problem: one for $\epsilon$-accurate convex sub-problems and another for a single gradient-ascent step. We further develop a distributionally robust federated learning framework that learns robust models from heterogeneous data sets stored at distinct locations by solving each learner's sub-problem locally, offering robustness with modest computational overhead while accounting for the data distribution. Numerical tests corroborate the merits of our training algorithms against distributional uncertainties and adversarially corrupted test data samples.
{"title":"Learning Robust to Distributional Uncertainties and Adversarial Data","authors":"Alireza Sadeghi;Gang Wang;Georgios B. Giannakis","doi":"10.1109/JSAIT.2024.3381869","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3381869","url":null,"abstract":"Successful training of data-intensive deep neural networks critically rely on vast, clean, and high-quality datasets. In practice however, their reliability diminishes, particularly with noisy, outlier-corrupted data samples encountered in testing. This challenge intensifies when dealing with anonymized, heterogeneous data sets stored across geographically distinct locations due to, e.g., privacy concerns. This present paper introduces robust learning frameworks tailored for centralized and federated learning scenarios. Our goal is to fortify model resilience with a focus that lies in (i) addressing distribution shifts from training to inference time; and, (ii) ensuring test-time robustness, when a trained model may encounter outliers or adversarially contaminated test data samples. To this aim, we start with a centralized setting where the true data distribution is considered unknown, but residing within a Wasserstein ball centered at the empirical distribution. We obtain robust models by minimizing the worst-case expected loss within this ball, yielding an intractable infinite-dimensional optimization problem. Upon leverage the strong duality condition, we arrive at a tractable surrogate learning problem. We develop two stochastic primal-dual algorithms to solve the resultant problem: one for \u0000<inline-formula> <tex-math>$epsilon $ </tex-math></inline-formula>\u0000-accurate convex sub-problems and another for a single gradient ascent step. We further develop a distributionally robust federated learning framework to learn robust model using heterogeneous data sets stored at distinct locations by solving per-learner’s sub-problems locally, offering robustness with modest computational overhead and considering data distribution. Numerical tests corroborate merits of our training algorithms against distributional uncertainties and adversarially corrupted test data samples.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"105-122"},"PeriodicalIF":0.0,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140619580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
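
The "tractable surrogate learning problem" in such Wasserstein-ball formulations typically comes from the standard strong-duality reformulation below, stated here in its general-literature form rather than quoted from the paper (notation is ours):

```latex
% Strong duality for Wasserstein DRO over a ball of radius \rho around the
% empirical distribution \widehat{P}_n, with ground cost c(\cdot,\cdot):
\sup_{Q \,:\, W_c(Q,\, \widehat{P}_n) \le \rho} \mathbb{E}_{Q}\big[\ell(\theta; z)\big]
  \;=\; \min_{\lambda \ge 0} \Big\{ \lambda \rho
  \;+\; \frac{1}{n} \sum_{i=1}^{n} \sup_{z} \big[\, \ell(\theta; z) - \lambda\, c(z, z_i) \,\big] \Big\}
```

The inner supremum over z is the step the abstract's two stochastic primal-dual algorithms handle, either to $\epsilon$-accuracy or with a single gradient-ascent step.
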
Exactly Tight Information-Theoretic Generalization Error Bound for the Quadratic Gaussian Problem
Pub Date: 2024-03-22 | DOI: 10.1109/JSAIT.2024.3380598
Ruida Zhou;Chao Tian;Tie Liu
We provide a new information-theoretic generalization error bound that is exactly tight (i.e., matching even the constant) for the canonical quadratic Gaussian (location) problem. Most existing bounds are order-wise loose in this setting, which has raised concerns about the fundamental capability of information-theoretic bounds in reasoning about the generalization behavior of machine learning. The proposed new bound adopts the individual-sample-based approach proposed by Bu et al., but also has several key new ingredients. Firstly, instead of applying the change-of-measure inequality to the loss function, we apply it to the generalization error function itself; secondly, the bound is derived in a conditional manner; lastly, a reference distribution is introduced. The combination of these components produces a KL-divergence-based generalization error bound. We show that although the latter two new ingredients can help make the bound exactly tight, removing them does not significantly degrade the bound, leading to an asymptotically tight mutual-information-based bound. We further consider the vector Gaussian setting, where a direct application of the proposed bound again does not lead to tight bounds except in special cases. A refined bound is then proposed via a decomposition of loss functions, leading to a tight bound for the vector setting.
{"title":"Exactly Tight Information-Theoretic Generalization Error Bound for the Quadratic Gaussian Problem","authors":"Ruida Zhou;Chao Tian;Tie Liu","doi":"10.1109/JSAIT.2024.3380598","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3380598","url":null,"abstract":"We provide a new information-theoretic generalization error bound that is exactly tight (i.e., matching even the constant) for the canonical quadratic Gaussian (location) problem. Most existing bounds are order-wise loose in this setting, which has raised concerns about the fundamental capability of information-theoretic bounds in reasoning the generalization behavior for machine learning. The proposed new bound adopts the individual-sample-based approach proposed by Bu et al., but also has several key new ingredients. Firstly, instead of applying the change of measure inequality on the loss function, we apply it to the generalization error function itself; secondly, the bound is derived in a conditional manner; lastly, a reference distribution is introduced. The combination of these components produces a KL-divergence-based generalization error bound. We show that although the latter two new ingredients can help make the bound exactly tight, removing them does not significantly degrade the bound, leading to an asymptotically tight mutual-information-based bound. We further consider the vector Gaussian setting, where a direct application of the proposed bound again does not lead to tight bounds except in special cases. A refined bound is then proposed by a decomposition of loss functions, leading to a tight bound for the vector setting.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"94-104"},"PeriodicalIF":0.0,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140606029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
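
As background for why "exactly tight" is a meaningful target here: in the canonical quadratic Gaussian location problem the generalization gap of the empirical-mean learner has a simple closed form. This is a standard computation, not quoted from the paper.

```latex
% Setting: Z_1,\dots,Z_n \sim \mathcal{N}(\mu, \sigma^2 I_d) i.i.d.,
% loss \ell(w,z) = \|w - z\|^2, learner outputs the sample mean
% W = \tfrac{1}{n}\sum_i Z_i. Direct computation of the two risks gives
\mathbb{E}\, L_P(W) = \sigma^2 d + \frac{\sigma^2 d}{n},
\qquad
\mathbb{E}\, L_n(W) = \sigma^2 d - \frac{\sigma^2 d}{n},
% so the expected generalization gap is exactly
\operatorname{gen} = \mathbb{E}\big[L_P(W) - L_n(W)\big] = \frac{2\sigma^2 d}{n}.
```

An "exactly tight" information-theoretic bound is one that recovers this $2\sigma^2 d / n$ including the constant, which is what the abstract claims order-wise loose bounds fail to do.
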
Summary Statistic Privacy in Data Sharing
Pub Date: 2024-03-21 | DOI: 10.1109/JSAIT.2024.3403811
Zinan Lin;Shuaiqi Wang;Vyas Sekar;Giulia Fanti
We study a setting where a data holder wishes to share data with a receiver without revealing certain summary statistics of the data distribution (e.g., mean, standard deviation). The holder achieves this by passing the data through a randomization mechanism. We propose summary statistic privacy, a metric for quantifying the privacy risk of such a mechanism based on the worst-case probability of an adversary guessing the distributional secret within some threshold. Defining distortion as a worst-case Wasserstein-1 distance between the real and released data, we prove lower bounds on the tradeoff between privacy and distortion. We then propose a class of quantization mechanisms that can be adapted to different data distributions. We show that the quantization mechanism's privacy-distortion tradeoff matches our lower bounds under certain regimes, up to small constant factors. Finally, we demonstrate on real-world datasets that the proposed quantization mechanisms achieve better privacy-distortion tradeoffs than alternative privacy mechanisms.
{"title":"Summary Statistic Privacy in Data Sharing","authors":"Zinan Lin;Shuaiqi Wang;Vyas Sekar;Giulia Fanti","doi":"10.1109/JSAIT.2024.3403811","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3403811","url":null,"abstract":"We study a setting where a data holder wishes to share data with a receiver, without revealing certain summary statistics of the data distribution (e.g., mean, standard deviation). It achieves this by passing the data through a randomization mechanism. We propose summary statistic privacy, a metric for quantifying the privacy risk of such a mechanism based on the worst-case probability of an adversary guessing the distributional secret within some threshold. Defining distortion as a worst-case Wasserstein-1 distance between the real and released data, we prove lower bounds on the tradeoff between privacy and distortion. We then propose a class of quantization mechanisms that can be adapted to different data distributions. We show that the quantization mechanism’s privacy-distortion tradeoff matches our lower bounds under certain regimes, up to small constant factors. Finally, we demonstrate on real-world datasets that the proposed quantization mechanisms achieve better privacy-distortion tradeoffs than alternative privacy mechanisms.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"369-384"},"PeriodicalIF":0.0,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141422569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
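
The paper's mechanisms and their guarantees are distribution-specific; the toy sketch below only conveys the generic quantization idea (snap the secret statistic onto a coarse, randomly shifted grid so that guessing it within a fine threshold is hard, at bounded Wasserstein-1 distortion). All names and parameter choices here are hypothetical, not the paper's.

```python
import numpy as np

def release_with_quantized_mean(data, delta, rng):
    # Snap the empirical mean onto a grid of spacing delta with a random
    # phase (so the grid itself leaks nothing about the true mean). The
    # released sample is a pure shift of at most delta / 2 per point,
    # which bounds the Wasserstein-1 distortion by delta / 2.
    mu = data.mean()
    phase = rng.uniform(0.0, delta)
    mu_q = np.round((mu - phase) / delta) * delta + phase
    return data + (mu_q - mu)

rng = np.random.default_rng(2)
secret_mean = 3.37                  # the statistic the holder wants to hide
data = rng.normal(secret_mean, 1.0, size=10_000)
released = release_with_quantized_mean(data, delta=1.0, rng=rng)
print(f"true mean {data.mean():.3f} -> released mean {released.mean():.3f}")
```
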
Straggler-Resilient Differentially Private Decentralized Learning
Pub Date: 2024-03-20 | DOI: 10.1109/JSAIT.2024.3400995
Yauhen Yakimenka;Chung-Wei Weng;Hsuan-Yin Lin;Eirik Rosnes;Jörg Kliewer
We consider the straggler problem in decentralized learning over a logical ring while preserving user data privacy. In particular, we extend the recently proposed framework of differential privacy (DP) amplification by decentralization of Cyffers and Bellet to include overall training latency, comprising both computation and communication latency. Analytical results on both the convergence speed and the DP level are derived for a skipping scheme (which ignores stragglers after a timeout) and for a baseline scheme that waits for each node to finish before training continues. A trade-off between overall training latency, accuracy, and privacy, parameterized by the timeout of the skipping scheme, is identified and empirically validated for logistic regression on a real-world dataset and for image classification using the MNIST and CIFAR-10 datasets.
{"title":"Straggler-Resilient Differentially Private Decentralized Learning","authors":"Yauhen Yakimenka;Chung-Wei Weng;Hsuan-Yin Lin;Eirik Rosnes;Jörg Kliewer","doi":"10.1109/JSAIT.2024.3400995","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3400995","url":null,"abstract":"We consider the straggler problem in decentralized learning over a logical ring while preserving user data privacy. Especially, we extend the recently proposed framework of differential privacy (DP) amplification by decentralization by Cyffers and Bellet to include overall training latency—comprising both computation and communication latency. Analytical results on both the convergence speed and the DP level are derived for both a skipping scheme (which ignores the stragglers after a timeout) and a baseline scheme that waits for each node to finish before the training continues. A trade-off between overall training latency, accuracy, and privacy, parameterized by the timeout of the skipping scheme, is identified and empirically validated for logistic regression on a real-world dataset and for image classification using the MNIST and CIFAR-10 datasets.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"407-423"},"PeriodicalIF":0.0,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141495176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
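
A minimal latency simulation of the skipping scheme's effect, under our own toy timing model (exponential compute times, zero communication cost): capping each node's wait at the timeout bounds round latency at the price of fewer contributing updates. The paper's actual analysis also covers convergence speed and the resulting DP level.

```python
import numpy as np

rng = np.random.default_rng(3)
n_nodes, timeout = 10, 1.5          # ring size and timeout (our toy choices)

def ring_round(timeout):
    latency, updated = 0.0, 0
    for _ in range(n_nodes):
        work = rng.exponential(1.0)  # this node's local compute time
        if work <= timeout:
            latency += work          # node finishes and contributes
            updated += 1
        else:
            latency += timeout       # skipping scheme: give up at the timeout
    return latency, updated

lat, updated = ring_round(timeout)
print(f"round latency {lat:.2f}, {updated}/{n_nodes} nodes contributed")
```
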
Robust Algorithmic Recourse Under Model Multiplicity With Probabilistic Guarantees
Pub Date: 2024-03-15 | DOI: 10.1109/JSAIT.2024.3401407
Faisal Hamman;Erfaun Noorani;Saumitra Mishra;Daniele Magazzeni;Sanghamitra Dutta
There is an emerging interest in generating robust algorithmic recourse that would remain valid if the model is updated or changed even slightly. Towards finding robust algorithmic recourse (or counterfactual explanations), existing literature often assumes that the original model m and the new model M are bounded in the parameter space, i.e., $\|\text{Params}(M) - \text{Params}(m)\| < \Delta$. However, models can often change significantly in the parameter space with little to no change in their predictions or accuracy on the given dataset. In this work, we introduce a mathematical abstraction termed naturally-occurring model change, which allows for arbitrary changes in the parameter space such that the change in predictions on points that lie on the data manifold is limited. Next, we propose a measure – that we call Stability – to quantify the robustness of counterfactuals to potential model changes for differentiable models, e.g., neural networks. Our main contribution is to show that counterfactuals with sufficiently high value of Stability as defined by our measure will remain valid after potential “naturally-occurring” model changes with high probability (leveraging concentration bounds for Lipschitz function of independent Gaussians). Since our quantification depends on the local Lipschitz constant around a data point which is not always available, we also examine estimators of our proposed measure and derive a fundamental lower bound on the sample size required to have a precise estimate. We explore methods of using stability measures to generate robust counterfactuals that are close, realistic, and remain valid after potential model changes. This work also has interesting connections with model multiplicity, also known as the Rashomon effect.
{"title":"Robust Algorithmic Recourse Under Model Multiplicity With Probabilistic Guarantees","authors":"Faisal Hamman;Erfaun Noorani;Saumitra Mishra;Daniele Magazzeni;Sanghamitra Dutta","doi":"10.1109/JSAIT.2024.3401407","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3401407","url":null,"abstract":"There is an emerging interest in generating robust algorithmic recourse that would remain valid if the model is updated or changed even slightly. Towards finding robust algorithmic recourse (or counterfactual explanations), existing literature often assumes that the original model \u0000<italic>m</i>\u0000 and the new model \u0000<italic>M</i>\u0000 are bounded in the parameter space, i.e., \u0000<inline-formula> <tex-math>$|text {Params}(M){-}text {Params}(m)|{lt }Delta $ </tex-math></inline-formula>\u0000. However, models can often change significantly in the parameter space with little to no change in their predictions or accuracy on the given dataset. In this work, we introduce a mathematical abstraction termed \u0000<italic>naturally-occurring</i>\u0000 model change, which allows for arbitrary changes in the parameter space such that the change in predictions on points that lie on the data manifold is limited. Next, we propose a measure – that we call \u0000<italic>Stability</i>\u0000 – to quantify the robustness of counterfactuals to potential model changes for differentiable models, e.g., neural networks. Our main contribution is to show that counterfactuals with sufficiently high value of \u0000<italic>Stability</i>\u0000 as defined by our measure will remain valid after potential “naturally-occurring” model changes with high probability (leveraging concentration bounds for Lipschitz function of independent Gaussians). Since our quantification depends on the local Lipschitz constant around a data point which is not always available, we also examine estimators of our proposed measure and derive a fundamental lower bound on the sample size required to have a precise estimate. We explore methods of using stability measures to generate robust counterfactuals that are close, realistic, and remain valid after potential model changes. This work also has interesting connections with model multiplicity, also known as the Rashomon effect.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"357-368"},"PeriodicalIF":0.0,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141435351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
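
A hedged Monte-Carlo sketch in the spirit of the Stability measure: probe the model around a counterfactual with Gaussian noise and favor high, low-variance scores. The paper's exact definition, its use of the local Lipschitz constant, and the probabilistic guarantee are not reproduced; the function names and constants below are ours.

```python
import numpy as np

def stability_estimate(predict, x_cf, sigma=0.1, m=1000, rng=None):
    # Monte-Carlo probe (ours): mean favorable-class score minus its std
    # under Gaussian perturbations of the counterfactual x_cf.
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.normal(scale=sigma, size=(m, x_cf.size))
    scores = predict(x_cf + noise)
    return scores.mean() - scores.std()

# toy differentiable model: logistic score along a fixed direction
w = np.array([2.0, -1.0])
predict = lambda X: 1.0 / (1.0 + np.exp(-(X @ w)))
x_cf = np.array([1.0, 0.2])          # hypothetical counterfactual point
print(f"stability estimate ~ {stability_estimate(predict, x_cf):.3f}")
```
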
Contraction of Locally Differentially Private Mechanisms
Pub Date: 2024-03-09 | DOI: 10.1109/JSAIT.2024.3397305
Shahab Asoodeh;Huanyu Zhang
We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between the output distributions $P\mathsf{K}$ and $Q\mathsf{K}$ of an $\varepsilon$-LDP mechanism $\mathsf{K}$ in terms of a divergence between the corresponding input distributions $P$ and $Q$. Our first main technical result presents a sharp upper bound on the $\chi^2$-divergence $\chi^2(P\mathsf{K} \| Q\mathsf{K})$ in terms of $\chi^2(P \| Q)$ and $\varepsilon$. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on $\chi^2(P\mathsf{K} \| Q\mathsf{K})$ in terms of the total variation distance $\mathsf{TV}(P, Q)$ and $\varepsilon$. We then utilize these bounds to establish locally private versions of the van Trees inequality, Le Cam's, Assouad's, and the mutual information methods, powerful tools for bounding minimax estimation risks. These results are shown to lead to tighter privacy analyses than the state of the art in several statistical problems such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.
{"title":"Contraction of Locally Differentially Private Mechanisms","authors":"Shahab Asoodeh;Huanyu Zhang","doi":"10.1109/JSAIT.2024.3397305","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397305","url":null,"abstract":"We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between \u0000<inline-formula> <tex-math>$P{mathsf K}$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$Q{mathsf K}$ </tex-math></inline-formula>\u0000 output distributions of an \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000-LDP mechanism \u0000<inline-formula> <tex-math>$mathsf K$ </tex-math></inline-formula>\u0000 in terms of a divergence between the corresponding input distributions P and Q, respectively. Our first main technical result presents a sharp upper bound on the \u0000<inline-formula> <tex-math>$chi ^{2}$ </tex-math></inline-formula>\u0000-divergence \u0000<inline-formula> <tex-math>$chi ^{2}(P{mathsf K}|Q{mathsf K})$ </tex-math></inline-formula>\u0000 in terms of \u0000<inline-formula> <tex-math>$chi ^{2}(P|Q)$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on \u0000<inline-formula> <tex-math>$chi ^{2}(P{mathsf K}|Q{mathsf K})$ </tex-math></inline-formula>\u0000 in terms of total variation distance \u0000<inline-formula> <tex-math>${textsf {TV}}(P, Q)$ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$varepsilon $ </tex-math></inline-formula>\u0000. We then utilize these bounds to establish locally private versions of the van Trees inequality, Le Cam’s, Assouad’s, and the mutual information methods —powerful tools for bounding minimax estimation risks. These results are shown to lead to tighter privacy analyses than the state-of-the-arts in several statistical problems such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"385-395"},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141494817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
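
The paper's sharp bounds are not restated here. The snippet below only illustrates the contraction phenomenon numerically for k-ary randomized response, comparing the observed χ² contraction against the classical Dobrushin coefficient η_TV = (e^ε − 1)/(e^ε + k − 1), which is known to upper-bound the contraction of every f-divergence; the paper derives tighter ε-dependent bounds than this baseline.

```python
import numpy as np

def krr_channel(k, eps):
    # k-ary randomized response as a row-stochastic matrix
    p = np.exp(eps) / (np.exp(eps) + k - 1)
    q = 1.0 / (np.exp(eps) + k - 1)
    return np.full((k, k), q) + (p - q) * np.eye(k)

def chi2(P, Q):
    return np.sum((P - Q) ** 2 / Q)

rng = np.random.default_rng(4)
k, eps = 4, 1.0
K = krr_channel(k, eps)
eta_tv = (np.exp(eps) - 1) / (np.exp(eps) + k - 1)  # Dobrushin coefficient of K

worst = 0.0
for _ in range(5000):
    P, Q = rng.dirichlet(np.ones(k)), rng.dirichlet(np.ones(k))
    worst = max(worst, chi2(P @ K, Q @ K) / chi2(P, Q))

print(f"observed chi^2 contraction <= {worst:.4f}; Dobrushin eta_TV = {eta_tv:.4f}")
```
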
Multi-Group Fairness Evaluation via Conditional Value-at-Risk Testing
Pub Date: 2024-03-09 | DOI: 10.1109/JSAIT.2024.3397741
Lucas Monteiro Paes;Ananda Theertha Suresh;Alex Beutel;Flavio P. Calmon;Ahmad Beirami
Machine learning (ML) models used in prediction and classification tasks may display performance disparities across population groups determined by sensitive attributes (e.g., race, sex, age). We consider the problem of evaluating the performance of a fixed ML model across population groups defined by multiple sensitive attributes (e.g., race and sex and age). Here, the sample complexity for estimating the worst-case performance gap across groups (e.g., the largest difference in error rates) increases exponentially with the number of group-denoting sensitive attributes. To address this issue, we propose an approach to test for performance disparities based on Conditional Value-at-Risk (CVaR). By allowing a small probabilistic slack on the groups over which a model has approximately equal performance, we show that the sample complexity required for discovering performance violations is reduced exponentially to be at most upper bounded by the square root of the number of groups. As a byproduct of our analysis, when the groups are weighted by a specific prior distribution, we show that Rényi entropy of order 2/3 of the prior distribution captures the sample complexity of the proposed CVaR test algorithm. Finally, we also show that there exists a non-i.i.d. data collection strategy that results in a sample complexity independent of the number of groups.
{"title":"Multi-Group Fairness Evaluation via Conditional Value-at-Risk Testing","authors":"Lucas Monteiro Paes;Ananda Theertha Suresh;Alex Beutel;Flavio P. Calmon;Ahmad Beirami","doi":"10.1109/JSAIT.2024.3397741","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397741","url":null,"abstract":"Machine learning (ML) models used in prediction and classification tasks may display performance disparities across population groups determined by sensitive attributes (e.g., race, sex, age). We consider the problem of evaluating the performance of a fixed ML model across population groups defined by multiple sensitive attributes (e.g., race and sex and age). Here, the sample complexity for estimating the worst-case performance gap across groups (e.g., the largest difference in error rates) increases exponentially with the number of group-denoting sensitive attributes. To address this issue, we propose an approach to test for performance disparities based on Conditional Value-at-Risk (CVaR). By allowing a small probabilistic slack on the groups over which a model has approximately equal performance, we show that the sample complexity required for discovering performance violations is reduced exponentially to be at most upper bounded by the square root of the number of groups. As a byproduct of our analysis, when the groups are weighted by a specific prior distribution, we show that Rényi entropy of order 2/3 of the prior distribution captures the sample complexity of the proposed CVaR test algorithm. Finally, we also show that there exists a non-i.i.d. data collection strategy that results in a sample complexity independent of the number of groups.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"659-674"},"PeriodicalIF":0.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142736343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
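
A small sketch of the quantity being tested, using one common discrete form of CVaR (the average loss over the worst α-fraction of groups); the paper's testing algorithm and its sample-complexity guarantees are not reproduced, and the numbers below are made up.

```python
import numpy as np

def cvar(losses, alpha):
    # discrete CVaR_alpha: mean over the worst ceil(alpha * m) groups
    losses = np.sort(np.asarray(losses))[::-1]
    m = max(1, int(np.ceil(alpha * len(losses))))
    return losses[:m].mean()

# made-up per-group error rates over 8 intersectional groups
group_err = np.array([0.02, 0.03, 0.03, 0.04, 0.05, 0.05, 0.06, 0.20])
print(f"worst-case group error: {group_err.max():.2f}")
print(f"CVaR at alpha = 0.25  : {cvar(group_err, 0.25):.2f}")
```

Testing a threshold on this CVaR instead of on the absolute maximum is what gives the probabilistic slack over the best-behaved groups that the abstract credits for the exponential-to-square-root reduction in sample complexity.
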
Efficient and Robust Classification for Sparse Attacks
Pub Date: 2024-03-06 | DOI: 10.1109/JSAIT.2024.3397187
Mark Beliaev;Payam Delgosha;Hamed Hassani;Ramtin Pedarsani
Over the past two decades, the adoption of neural networks has surged in parallel with their performance. Concurrently, we have observed the inherent fragility of these prediction models: small changes to the inputs can induce classification errors across entire datasets. In the following study, we examine perturbations constrained by the $\ell_0$-norm, a potent attack model in the domains of computer vision, malware detection, and natural language processing. To combat this adversary, we introduce a novel defense technique comprising two components: “truncation” and “adversarial training”. Subsequently, we conduct a theoretical analysis of the Gaussian mixture setting and establish the asymptotic optimality of our proposed defense. Based on this insight, we broaden the application of our technique to neural networks. Lastly, we empirically validate our results in the domain of computer vision, demonstrating substantial enhancements in the robust classification error of neural networks.
{"title":"Efficient and Robust Classification for Sparse Attacks","authors":"Mark Beliaev;Payam Delgosha;Hamed Hassani;Ramtin Pedarsani","doi":"10.1109/JSAIT.2024.3397187","DOIUrl":"https://doi.org/10.1109/JSAIT.2024.3397187","url":null,"abstract":"Over the past two decades, the rise in adoption of neural networks has surged in parallel with their performance. Concurrently, we have observed the inherent fragility of these prediction models: small changes to the inputs can induce classification errors across entire datasets. In the following study, we examine perturbations constrained by the \u0000<inline-formula> <tex-math>$ell _{0}$ </tex-math></inline-formula>\u0000–norm, a potent attack model in the domains of computer vision, malware detection, and natural language processing. To combat this adversary, we introduce a novel defense technique comprised of two components: “truncation” and “adversarial training”. Subsequently, we conduct a theoretical analysis of the Gaussian mixture setting and establish the asymptotic optimality of our proposed defense. Based on this obtained insight, we broaden the application of our technique to neural networks. Lastly, we empirically validate our results in the domain of computer vision, demonstrating substantial enhancements in the robust classification error of neural networks.","PeriodicalId":73295,"journal":{"name":"IEEE journal on selected areas in information theory","volume":"5 ","pages":"261-272"},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141096203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
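
A simplified sketch of the truncation idea for a linear score, with our own toy attack: dropping the k most extreme per-coordinate contributions on each side caps how much an adversary who edits at most k coordinates can move the decision. The paper's actual defense and its Gaussian-mixture analysis differ in detail.

```python
import numpy as np

def truncated_score(w, x, k):
    # drop the k largest and k smallest signed contributions before summing
    contrib = np.sort(w * x)
    return float(contrib[k:-k].sum()) if k > 0 else float(contrib.sum())

rng = np.random.default_rng(5)
d, k = 200, 5
w = rng.choice([-1.0, 1.0], size=d)       # toy linear classifier
x = w + rng.normal(scale=0.5, size=d)     # clean positive-class point

x_adv = x.copy()
hit = rng.choice(d, size=k, replace=False)
x_adv[hit] = -50.0 * w[hit]               # ell_0 attack: edit only k coordinates

print("plain score     :", round(float(w @ x), 1), "->", round(float(w @ x_adv), 1))
print("truncated score :", round(truncated_score(w, x, k), 1),
      "->", round(truncated_score(w, x_adv, k), 1))
```

On this toy instance the plain score flips sign under the sparse attack while the truncated score stays positive, since the k poisoned contributions land among the discarded extremes.
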