Information Leakage by Model Weights on Federated Learning. Xiaoyun Xu, Jingzheng Wu, Mutian Yang, Tianyue Luo, Xu Duan, Weiheng Li, Yanjun Wu, Bin Wu. DOI: 10.1145/3411501.3419423
Federated learning aggregates data from multiple sources while protecting privacy, which makes it possible to train effective models in real-world scenarios. However, although federated learning uses encrypted secure aggregation, its decentralized nature makes it vulnerable to malicious attackers. A deliberate attacker can subtly control one or more participants and upload malicious model parameter updates, and the aggregation server cannot detect this because of the encrypted privacy protection. Based on these observations, we identify a practical and novel security risk in the design of federated learning. We propose an attack in which colluding malicious participants adjust their training data strategically so that the weight of a chosen dimension in the aggregated model rises or falls in a prescribed pattern. These weight trends form meaningful signals and therefore constitute an information-leakage channel. The leaked information is visible to every participant in the federation but intelligible only to participants who have reached an agreement with the malicious participant, i.e., the receiver must be able to interpret the pattern of weight changes. The attack is evaluated and verified on open-source code and datasets.
Faster Secure Multiparty Computation of Adaptive Gradient Descent. Wen-jie Lu, Yixuan Fang, Zhicong Huang, Cheng Hong, Chaochao Chen, Hunter Qu, Yajin Zhou, K. Ren. DOI: 10.1145/3411501.3419427
Most secure multi-party computation (MPC) machine learning methods can only afford simple gradient descent (sGD) optimizers and are unable to benefit from the recent progress of adaptive GD optimizers (e.g., Adagrad, Adam and their variants), which involve square-root and reciprocal operations that are hard to compute in MPC. To mitigate this issue, we introduce InvertSqrt, an efficient MPC protocol for computing 1/√x. We then implement the Adam adaptive GD optimizer based on InvertSqrt and use it for training on different datasets. The training costs compare favorably to the sGD ones, indicating that adaptive GD optimizers in MPC have become practical.
Zero-Knowledge Proofs for Machine Learning. Yupeng Zhang. DOI: 10.1145/3411501.3418608
Machine learning has become increasingly prominent and is widely used in various applications in practice. Despite its great success, the integrity of machine learning predictions and accuracy is a rising concern. The reproducibility of machine learning models that are claimed to achieve high accuracy remains challenging, and the correctness and consistency of machine learning predictions in real products lack any security guarantees. We introduce some of our recent results on applying the cryptographic primitive of zero-knowledge proofs to the domain of machine learning to address these issues. The protocols allow the owner of a machine learning model to convince others that the model computes a particular prediction on a data sample, or achieves a high accuracy on public datasets, without leaking any information about the machine learning model itself. We developed efficient zero-knowledge proof protocols for decision trees, random forests and neural networks.
Privacy-Preserving in Defending against Membership Inference Attacks. Zuobin Ying, Yun Zhang, Ximeng Liu. DOI: 10.1145/3411501.3419428
A membership inference attack aims to infer whether a given data sample was part of the target classifier's training dataset. The ability of an adversary to ascertain the presence of an individual constitutes an obvious privacy threat when the data relate to a group of users who share a sensitive characteristic. Many defenses against membership inference attacks have been proposed, but they have not achieved the expected privacy effect. In this paper, we quantify the impact of these defense choices on privacy in experiments using logistic regression and neural network models. Using both formal and empirical analyses, we illustrate that differential privacy and L2 regularization can effectively prevent membership inference attacks.
Engineering Privacy-Preserving Machine Learning Protocols. T. Schneider. DOI: 10.1145/3411501.3418607
Privacy-preserving machine learning (PPML) protocols make it possible to privately evaluate or even train machine learning (ML) models on sensitive data while simultaneously protecting the data and the model. So far, most of these protocols have been built and optimized by hand, which requires expert knowledge in cryptography as well as a thorough understanding of the ML models. Moreover, the design space is very large, as there are many technologies that can be combined, each with its own trade-offs. Examples of the underlying cryptographic building blocks include homomorphic encryption (HE), where computation is typically the bottleneck, and secure multi-party computation (MPC) protocols, which rely mostly on symmetric-key cryptography and where communication is often the bottleneck. In this keynote, I will describe our research towards engineering practical PPML protocols that protect models and data. First of all, there is no point in designing PPML protocols for overly simple models such as Support Vector Machines (SVMs) or Support Vector Regression Machines (SVRs), because they can be stolen easily [10] and hence do not benefit from protection. Complex models can be protected and evaluated in real time using Trusted Execution Environments (TEEs), which we demonstrated for speech recognition using Intel SGX [5] and for keyword recognition using ARM TrustZone [3] as respective commercial TEE technologies. Our goal is to build tools for non-experts in cryptography to automatically generate highly optimized mixed PPML protocols given a high-level specification in an ML framework like TensorFlow. Towards this, we have built tools that automatically generate optimized mixed protocols combining HE and different MPC protocols [6-8]. Such mixed protocols can, for example, be used for the efficient privacy-preserving evaluation of decision trees [1, 2, 9, 13] and neural networks [2, 11, 12]. The first PPML protocols for these ML classifiers were proposed long before the current hype on PPML started [1, 2, 12]. We already have first results on compiling high-level ML specifications via our tools into mixed protocols for neural networks (from TensorFlow) [4] and sum-product networks (from SPFlow) [14], and I will conclude with major open challenges.
TinyGarble2. S. Hussain, Baiyu Li, F. Koushanfar, Rosario Cammarota. DOI: 10.1145/3411501.3419433
We present TinyGarble2, a C++ framework for privacy-preserving computation through Yao's Garbled Circuit (GC) protocol in both the honest-but-curious and the malicious security models. TinyGarble2 provides a rich library with arithmetic and logic building blocks for developing GC-based secure applications. The framework offers abstractions among three layers: the C++ program, the GC back-end and the Boolean logic representation of the function being computed, allowing the most optimized version of each pertinent component to be used. These abstractions, coupled with secure share transfer among the functions, make TinyGarble2 the fastest and most memory-efficient GC framework. In addition, the framework provides a library for Convolutional Neural Networks (CNNs). Our evaluations show that TinyGarble2 is the fastest among current end-to-end GC frameworks while also being scalable in terms of memory footprint. Moreover, it performs 18x faster on the CNN LeNet-5 than the existing scalable frameworks.
Delphi: A Cryptographic Inference System for Neural Networks. Pratyush Mishra, Ryan T. Lehmkuhl, Akshayaram Srinivasan, Wenting Zheng, R. A. Popa. DOI: 10.1145/3411501.3419418
Many companies provide neural network prediction services to users for a wide range of applications. However, current prediction systems compromise one party's privacy: either the user has to send sensitive inputs to the service provider for classification, or the service provider must store its proprietary neural networks on the user's device. The former harms the personal privacy of the user, while the latter reveals the service provider's proprietary model. We design, implement, and evaluate Delphi, a secure prediction system that allows two parties to execute neural network inference without revealing either party's data. Delphi approaches the problem by simultaneously co-designing cryptography and machine learning. We first design a hybrid cryptographic protocol that improves upon the communication and computation costs over prior work. Second, we develop a planner that automatically generates neural network architecture configurations that navigate the performance-accuracy trade-offs of our hybrid protocol. Together, these techniques allow us to achieve a 22x improvement in online prediction latency compared to the state-of-the-art prior work.
A Systematic Comparison of Encrypted Machine Learning Solutions for Image Classification. Veneta Haralampieva, D. Rueckert, Jonathan Passerat-Palmbach. DOI: 10.1145/3411501.3419432
This work provides a comprehensive review of existing frameworks based on secure computing techniques in the context of private image classification. The in-depth analysis of these approaches is followed by a careful examination of their performance costs, in particular runtime and communication overhead. To further illustrate the practical considerations of using different privacy-preserving technologies, experiments were conducted using four state-of-the-art libraries implementing secure computing at the heart of the data science stack: PySyft and CrypTen, supporting private inference via Secure Multi-Party Computation; TF-Trusted, utilising Trusted Execution Environments; and HE-Transformer, relying on Homomorphic Encryption. Our work aims to evaluate the suitability of these frameworks from the points of view of usability, runtime requirements and accuracy. To better understand the gap between state-of-the-art protocols and what is currently available in practice to a data scientist, we designed three neural network architectures to obtain secure predictions via each of the four aforementioned frameworks. Two networks were evaluated on the MNIST dataset and one on the Malaria Cell image dataset. We observed satisfactory performance for TF-Trusted and CrypTen and noted that all frameworks perfectly preserved the accuracy of the corresponding plaintext model.
{"title":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","authors":"","doi":"10.1145/3411501","DOIUrl":"https://doi.org/10.1145/3411501","url":null,"abstract":"","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132780250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SVM Learning for Default Prediction of Credit Card under Differential Privacy. Jianping Cai, Ximeng Liu, Yingjie Wu. DOI: 10.1145/3411501.3419431
Financial institutions currently make extensive use of personal sensitive information in machine learning, which creates significant privacy risks for customers. As an essential privacy standard, differential privacy has been widely applied to machine learning in recent years. To build a credit card default prediction model while protecting personal privacy, we consider the problems of differing customer data contributions and imbalanced sample distributions, and propose a weighted SVM algorithm under differential privacy. Theoretical analysis guarantees that the algorithm satisfies differential privacy. The algorithm mitigates the prediction bias caused by imbalanced sample distributions and effectively reduces data sensitivity and noise error. The experimental results show that the proposed algorithm can accurately predict whether a customer will default while protecting personal privacy.