Information Leakage by Model Weights on Federated Learning. Xiaoyun Xu, Jingzheng Wu, Mutian Yang, Tianyue Luo, Xu Duan, Weiheng Li, Yanjun Wu, Bin Wu. DOI: 10.1145/3411501.3419423
Federated learning aggregates data from multiple sources while protecting privacy, which makes it possible to train effective models in real-world scenarios. However, although federated learning uses encrypted secure aggregation, its decentralized nature makes it vulnerable to malicious attackers. A deliberate attacker can subtly control one or more participants and upload malicious model parameter updates, and the aggregation server cannot detect this because of the encrypted privacy protection. Based on these observations, we identify a practical and novel security risk in the design of federated learning. We propose an attack in which colluding malicious participants adjust their training data strategically so that the weight of a chosen dimension in the aggregated model rises or falls in a prescribed pattern. These weight trends form meaningful signals and therefore constitute an information-leakage channel. The leaked information is visible to every participant in the federation but intelligible only to participants who have reached an agreement with the malicious participant, i.e., the receiver must be able to interpret the pattern of weight changes. The attack is evaluated and verified on open-source code and datasets.
Faster Secure Multiparty Computation of Adaptive Gradient Descent. Wen-jie Lu, Yixuan Fang, Zhicong Huang, Cheng Hong, Chaochao Chen, Hunter Qu, Yajin Zhou, K. Ren. DOI: 10.1145/3411501.3419427
Most secure multi-party computation (MPC) machine learning methods can only afford simple gradient descent (sGD) optimizers and are unable to benefit from the recent progress of adaptive GD optimizers (e.g., Adagrad, Adam and their variants), which involve square-root and reciprocal operations that are hard to compute in MPC. To mitigate this issue, we introduce InvertSqrt, an efficient MPC protocol for computing 1/√x. We then implement the Adam adaptive GD optimizer based on InvertSqrt and use it for training on different datasets. The training costs compare favorably to the sGD ones, indicating that adaptive GD optimizers in MPC have become practical.
Zero-Knowledge Proofs for Machine Learning. Yupeng Zhang. DOI: 10.1145/3411501.3418608
Machine learning has become increasingly prominent and is widely used in various applications in practice. Despite its great success, the integrity of machine learning predictions and accuracy is a rising concern. The reproducibility of machine learning models that are claimed to achieve high accuracy remains challenging, and the correctness and consistency of machine learning predictions in real products lack any security guarantees. We introduce some of our recent results on applying the cryptographic primitive of zero-knowledge proofs to the domain of machine learning to address these issues. The protocols allow the owner of a machine learning model to convince others that the model computes a particular prediction on a data sample, or achieves a high accuracy on public datasets, without leaking any information about the machine learning model itself. We developed efficient zero-knowledge proof protocols for decision trees, random forests and neural networks.
Privacy-Preserving in Defending against Membership Inference Attacks. Zuobin Ying, Yun Zhang, Ximeng Liu. DOI: 10.1145/3411501.3419428
A membership inference attack aims to infer whether a given data sample was part of the target classifier's training dataset. The ability of an adversary to ascertain the presence of an individual constitutes an obvious privacy threat when the data relate to a group of users who share a sensitive characteristic. Many defenses against membership inference attacks have been proposed, but they have not achieved the expected privacy effect. In this paper, we quantify the impact of these defense choices on privacy in experiments using logistic regression and neural network models. Using both formal and empirical analyses, we illustrate that differential privacy and L2 regularization can effectively prevent membership inference attacks.
Engineering Privacy-Preserving Machine Learning Protocols. T. Schneider. DOI: 10.1145/3411501.3418607
Privacy-preserving machine learning (PPML) protocols make it possible to privately evaluate or even train machine learning (ML) models on sensitive data while simultaneously protecting the data and the model. So far, most of these protocols have been built and optimized by hand, which requires expert knowledge in cryptography as well as a thorough understanding of the ML models. Moreover, the design space is very large, as there are many technologies that can be combined, each with its own trade-offs. Examples of the underlying cryptographic building blocks include homomorphic encryption (HE), where computation is typically the bottleneck, and secure multi-party computation (MPC) protocols, which rely mostly on symmetric-key cryptography and where communication is often the bottleneck. In this keynote, I will describe our research towards engineering practical PPML protocols that protect models and data. First of all, there is no point in designing PPML protocols for overly simple models such as Support Vector Machines (SVMs) or Support Vector Regression Machines (SVRs), because they can be stolen easily [10] and hence do not benefit from protection. Complex models can be protected and evaluated in real time using Trusted Execution Environments (TEEs), which we demonstrated for speech recognition using Intel SGX [5] and for keyword recognition using ARM TrustZone [3] as respective commercial TEE technologies. Our goal is to build tools for non-experts in cryptography to automatically generate highly optimized mixed PPML protocols given a high-level specification in an ML framework like TensorFlow. Towards this, we have built tools that automatically generate optimized mixed protocols combining HE and different MPC protocols [6-8]. Such mixed protocols can, for example, be used for the efficient privacy-preserving evaluation of decision trees [1, 2, 9, 13] and neural networks [2, 11, 12]. The first PPML protocols for these ML classifiers were proposed long before the current hype on PPML started [1, 2, 12]. We already have first results on compiling high-level ML specifications via our tools into mixed protocols for neural networks (from TensorFlow) [4] and sum-product networks (from SPFlow) [14], and I will conclude with major open challenges.
TinyGarble2. S. Hussain, Baiyu Li, F. Koushanfar, Rosario Cammarota. DOI: 10.1145/3411501.3419433
We present TinyGarble2, a C++ framework for privacy-preserving computation through Yao's Garbled Circuit (GC) protocol in both the honest-but-curious and the malicious security models. TinyGarble2 provides a rich library with arithmetic and logic building blocks for developing GC-based secure applications. The framework offers abstractions among three layers: the C++ program, the GC back-end and the Boolean logic representation of the function being computed, allowing the most optimized version of each pertinent component to be used. These abstractions, coupled with secure share transfer among the functions, make TinyGarble2 the fastest and most memory-efficient GC framework. In addition, the framework provides a library for Convolutional Neural Networks (CNNs). Our evaluations show that TinyGarble2 is the fastest among current end-to-end GC frameworks while also being scalable in terms of memory footprint. Moreover, it performs 18x faster on the CNN LeNet-5 than the existing scalable frameworks.
Delphi: A Cryptographic Inference System for Neural Networks. Pratyush Mishra, Ryan T. Lehmkuhl, Akshayaram Srinivasan, Wenting Zheng, R. A. Popa. DOI: 10.1145/3411501.3419418
Many companies provide neural network prediction services to users for a wide range of applications. However, current prediction systems compromise one party's privacy: either the user has to send sensitive inputs to the service provider for classification, or the service provider must store its proprietary neural networks on the user's device. The former harms the personal privacy of the user, while the latter reveals the service provider's proprietary model. We design, implement, and evaluate Delphi, a secure prediction system that allows two parties to execute neural network inference without revealing either party's data. Delphi approaches the problem by simultaneously co-designing cryptography and machine learning. We first design a hybrid cryptographic protocol that improves upon the communication and computation costs over prior work. Second, we develop a planner that automatically generates neural network architecture configurations that navigate the performance-accuracy trade-offs of our hybrid protocol. Together, these techniques allow us to achieve a 22x improvement in online prediction latency compared to the state-of-the-art prior work.
A Systematic Comparison of Encrypted Machine Learning Solutions for Image Classification. Veneta Haralampieva, D. Rueckert, Jonathan Passerat-Palmbach. DOI: 10.1145/3411501.3419432
This work provides a comprehensive review of existing frameworks based on secure computing techniques in the context of private image classification. The in-depth analysis of these approaches is followed by a careful examination of their performance costs, in particular runtime and communication overhead. To further illustrate the practical considerations of using different privacy-preserving technologies, experiments were conducted using four state-of-the-art libraries implementing secure computing at the heart of the data science stack: PySyft and CrypTen, supporting private inference via Secure Multi-Party Computation; TF-Trusted, utilising Trusted Execution Environments; and HE-Transformer, relying on Homomorphic Encryption. Our work aims to evaluate the suitability of these frameworks from the points of view of usability, runtime requirements and accuracy. To better understand the gap between state-of-the-art protocols and what is currently available in practice to a data scientist, we designed three neural network architectures to obtain secure predictions via each of the four aforementioned frameworks. Two networks were evaluated on the MNIST dataset and one on the Malaria Cell image dataset. We observed satisfactory performance for TF-Trusted and CrypTen and noted that all frameworks perfectly preserved the accuracy of the corresponding plaintext model.
{"title":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","authors":"","doi":"10.1145/3411501","DOIUrl":"https://doi.org/10.1145/3411501","url":null,"abstract":"","PeriodicalId":116231,"journal":{"name":"Proceedings of the 2020 Workshop on Privacy-Preserving Machine Learning in Practice","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132780250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SVM Learning for Default Prediction of Credit Card under Differential Privacy. Jianping Cai, Ximeng Liu, Yingjie Wu. DOI: 10.1145/3411501.3419431
Financial institutions currently make extensive use of personal sensitive information in machine learning, which creates significant privacy risks for customers. As an essential privacy standard, differential privacy has been widely applied to machine learning in recent years. To build a credit card default prediction model while protecting personal privacy, we consider the problems of differing customer data contributions and imbalanced sample distributions, and propose a weighted SVM algorithm under differential privacy. Theoretical analysis guarantees that the algorithm satisfies differential privacy. The algorithm mitigates the prediction bias caused by imbalanced sample distributions and effectively reduces data sensitivity and noise error. The experimental results show that the proposed algorithm can accurately predict whether a customer will default while protecting personal privacy.