
Latest publications in IEEE transactions on privacy

Privacy-Preserving Verification of ML Preprocessing via Model Behavior Indicators.
Pub Date: 2025-01-01 Epub Date: 2025-11-04 DOI: 10.1109/tp.2025.3628998
Wenbiao Li, Anisa Halimi, Jaideep Vaidya, Xiaoqian Jiang, Erman Ayday

We present a privacy-preserving framework to verify whether a declared data preprocessing pipeline was correctly applied before training a machine learning model on sensitive data. The verifier has only black-box query access to the model and combines three behavior indicators: shift in prediction accuracy, Kullback-Leibler (KL) divergence between output distributions, and explanation vectors from Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). The method requires neither the original training records nor ground-truth labels. It supports two tasks: (i) a binary decision on correctness and (ii) a multi-class diagnosis identifying which step is missing. Experiments on three tabular datasets (Diabetes, Adult-Income, Student-Record) show that the binary detector maintains over 75% F1 even under strong local differential privacy ( ε = 0.1 ). Machine-learning classifiers consistently outperform simple threshold rules in the binary setting, while the two approaches perform comparably for multi-class diagnosis. A label-free variant that clusters explanation vectors achieves competitive accuracy, enabling verification when no labeled pipelines are available. These results demonstrate a practical and scalable approach for safeguarding preprocessing integrity in privacy-sensitive machine learning workflows.
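As an illustrative sketch only (not the authors' implementation), the KL-divergence behavior indicator over black-box model outputs could look like the following; `output_distribution`, `preprocessing_flag`, and the threshold value are hypothetical names and choices introduced here for illustration:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # KL(P || Q) between two discrete output distributions; eps guards log(0).
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def output_distribution(predict_fn, queries):
    # Average class-probability vector obtained via black-box queries only.
    probs = np.array([predict_fn(x) for x in queries])
    return probs.mean(axis=0)

def preprocessing_flag(ref_dist, obs_dist, threshold=0.05):
    # Flag the declared pipeline as suspect when the divergence between the
    # reference and observed output distributions exceeds a calibrated threshold.
    return kl_divergence(ref_dist, obs_dist) > threshold
```

In the paper's setting this indicator would be combined with the accuracy-shift and LIME/SHAP explanation-vector signals before a binary or multi-class decision is made.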

Citations: 0
Blockchain Based Secure Federated Learning With Local Differential Privacy and Incentivization.
Pub Date: 2024-01-01 Epub Date: 2024-11-08 DOI: 10.1109/tp.2024.3487819
Saptarshi DE Chaudhury, Likhith Reddy Morreddigari, Matta Varun, Tirthankar Sengupta, Sandip Chakraborty, Shamik Sural, Jaideep Vaidya, Vijayalakshmi Atluri

Interest in supporting Federated Learning (FL) using blockchains has grown significantly in recent years. However, restricting access to the trained models only to actively participating nodes remains a challenge even today. To address this concern, we propose a methodology that incentivizes model parameter sharing in an FL setup under Local Differential Privacy (LDP). The nodes that share less obfuscated data under LDP are awarded a higher quantum of tokens, which they can later use to obtain session keys for accessing encrypted model parameters updated by the server. If one or more of the nodes do not contribute to the learning process by sharing their data, or share only highly perturbed data, they earn fewer tokens. As a result, such nodes may not be able to read the new global model parameters if required. Local parameter sharing and updating of global parameters are done using the distributed ledger of a permissioned blockchain, namely HyperLedger Fabric (HLF). Being a blockchain-based approach, the risk of a single point of failure is also mitigated. Appropriate chaincodes, which are smart contracts in the HLF framework, have been developed for implementing the proposed methodology. Results of an extensive set of experiments firmly establish the feasibility of our approach.
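A minimal sketch of the incentive idea: an LDP mechanism (randomized response is shown here as one standard example) plus a token award that grows with the privacy budget a node spends. Both `randomized_response` and the linear `token_award` scaling are assumptions for illustration, not the paper's exact mechanism:

```python
import math
import random

def randomized_response(bit, epsilon):
    # Classic LDP randomized response: report the true bit with
    # probability e^eps / (1 + e^eps), otherwise flip it.
    p_true = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if random.random() < p_true else 1 - bit

def token_award(epsilon, base_tokens=100, eps_max=5.0):
    # Hypothetical award rule: nodes sharing less-perturbed updates
    # (a larger epsilon) earn proportionally more tokens, capped at eps_max.
    return int(base_tokens * min(epsilon, eps_max) / eps_max)
```

Under such a rule, a node sharing only highly perturbed data (small epsilon) accumulates fewer tokens and may be unable to buy the session keys needed to decrypt the next global model.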

Citations: 0
U.S.-U.K. PETs Prize Challenge: Anomaly Detection via Privacy-Enhanced Federated Learning.
Pub Date: 2024-01-01 Epub Date: 2024-04-23 DOI: 10.1109/tp.2024.3392721
Hafiz Asif, Sitao Min, Xinyue Wang, Jaideep Vaidya

Privacy Enhancing Technologies (PETs) have the potential to enable collaborative analytics without compromising privacy. This is extremely important because collaborative analytics can allow us to extract real value from the large amounts of data collected in domains such as healthcare, finance, and national security, among others. In order to foster innovation and move PETs from the research labs to actual deployment, the U.S. and U.K. governments partnered in 2021 to propose the PETs prize challenge, asking for privacy-enhancing solutions to two of the biggest problems facing us today: financial crime prevention and pandemic response. This article presents the Rutgers ScarletPets privacy-preserving federated learning approach to identify anomalous financial transactions in a payment network system (PNS). This approach utilizes a two-step anomaly detection methodology to solve the problem. In the first step, features are mined based on account-level data and labels, and then a privacy-preserving encoding scheme is used to augment these features to the data held by the PNS. In the second step, the PNS learns a highly accurate classifier from the augmented data. Our proposed approach has two major advantages: 1) there is no noteworthy drop in accuracy between the federated and the centralized setting, and 2) our approach is flexible, since the PNS can keep improving its model and features to build a better classifier without imposing any additional computational or privacy burden on the banks. Notably, our solution won the first prize in the US for its privacy, utility, efficiency, and flexibility.
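A toy sketch of the two-step shape of the pipeline, assuming a salted-hash join key so the PNS never sees raw bank account identifiers. `encode_account` and `augment_transactions` are hypothetical names, and a salted hash is only a stand-in here; the paper's actual privacy-preserving encoding scheme is more sophisticated:

```python
import hashlib

def encode_account(account_id, salt):
    # Encode an account identifier with a shared secret salt so that
    # bank-side features can be joined without exposing raw IDs.
    return hashlib.sha256((salt + account_id).encode("utf-8")).hexdigest()

def augment_transactions(transactions, bank_features, salt):
    # Step 1 output: bank_features keyed by encoded account id.
    # Step 2 input: each PNS transaction record is augmented with those
    # features before the PNS trains its anomaly classifier.
    augmented = []
    for txn in transactions:
        key = encode_account(txn["account"], salt)
        augmented.append({**txn, **bank_features.get(key, {})})
    return augmented
```

The design point the abstract highlights carries over even in this toy form: the banks' contribution is a one-way feature hand-off, so the PNS can retrain its classifier freely without asking the banks for more computation.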

Citations: 0