当模特学得太多时

Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy Pub Date : 2021-04-26 DOI:10.1145/3422337.3450327

David Evans

{"title":"当模特学得太多时","authors":"David Evans","doi":"10.1145/3422337.3450327","DOIUrl":null,"url":null,"abstract":"Statistical machine learning uses training data to produce models that capture patterns in that data. When models are trained on private data, such as medical records or personal emails, there is a risk that those models not only learn the hoped-for patterns, but will also learn and expose sensitive information about their training data. Several different types of inference attacks on machine learning models have been found, and methods have been proposed to mitigate the risks of exposing sensitive aspects of training data. Differential privacy provides formal guarantees bounding certain types of inference risk, but, at least with state-of-the-art methods, providing substantive differential privacy guarantees requires adding so much noise to the training process for com¬plex models that the resulting models are useless. Experimental evidence, however, suggests that inference attacks have limited power, and in many cases a very small amount of privacy noise seems to be enough to defuse inference attacks. In this talk, I will give an overview of a variety of different inference risks for machine learning models, talk about strategies for evaluating model inference risks, and report on some experiments by our research group to better understand the power of inference attacks in more realistic settings, and explore some broader the connections between privacy, fair-ness, and adversarial robustness.","PeriodicalId":187272,"journal":{"name":"Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy","volume":"142 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"When Models Learn Too Much\",\"authors\":\"David Evans\",\"doi\":\"10.1145/3422337.3450327\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Statistical machine learning uses training data to produce models that capture patterns in that data. When models are trained on private data, such as medical records or personal emails, there is a risk that those models not only learn the hoped-for patterns, but will also learn and expose sensitive information about their training data. Several different types of inference attacks on machine learning models have been found, and methods have been proposed to mitigate the risks of exposing sensitive aspects of training data. Differential privacy provides formal guarantees bounding certain types of inference risk, but, at least with state-of-the-art methods, providing substantive differential privacy guarantees requires adding so much noise to the training process for com¬plex models that the resulting models are useless. Experimental evidence, however, suggests that inference attacks have limited power, and in many cases a very small amount of privacy noise seems to be enough to defuse inference attacks. In this talk, I will give an overview of a variety of different inference risks for machine learning models, talk about strategies for evaluating model inference risks, and report on some experiments by our research group to better understand the power of inference attacks in more realistic settings, and explore some broader the connections between privacy, fair-ness, and adversarial robustness.\",\"PeriodicalId\":187272,\"journal\":{\"name\":\"Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy\",\"volume\":\"142 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3422337.3450327\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3422337.3450327","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

统计机器学习使用训练数据来生成捕获数据模式的模型。当使用私人数据(如医疗记录或个人电子邮件)对模型进行训练时，存在这样一种风险，即这些模型不仅会学习期望的模式，还会学习并暴露有关其训练数据的敏感信息。已经发现了针对机器学习模型的几种不同类型的推理攻击，并且已经提出了一些方法来降低暴露训练数据敏感方面的风险。差分隐私提供了限制某些类型的推理风险的正式保证，但是，至少对于最先进的方法，提供实质性的差分隐私保证需要在复杂模型的训练过程中添加太多的噪声，以至于生成的模型是无用的。然而，实验证据表明，推理攻击的力量有限，在许多情况下，非常少量的隐私噪声似乎足以化解推理攻击。在这次演讲中，我将概述机器学习模型的各种不同推理风险，讨论评估模型推理风险的策略，并报告我们研究小组的一些实验，以更好地理解推理攻击在更现实的环境中的力量，并探索隐私，公平和对抗性鲁棒性之间的更广泛的联系。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

When Models Learn Too Much

Statistical machine learning uses training data to produce models that capture patterns in that data. When models are trained on private data, such as medical records or personal emails, there is a risk that those models not only learn the hoped-for patterns, but will also learn and expose sensitive information about their training data. Several different types of inference attacks on machine learning models have been found, and methods have been proposed to mitigate the risks of exposing sensitive aspects of training data. Differential privacy provides formal guarantees bounding certain types of inference risk, but, at least with state-of-the-art methods, providing substantive differential privacy guarantees requires adding so much noise to the training process for com¬plex models that the resulting models are useless. Experimental evidence, however, suggests that inference attacks have limited power, and in many cases a very small amount of privacy noise seems to be enough to defuse inference attacks. In this talk, I will give an overview of a variety of different inference risks for machine learning models, talk about strategies for evaluating model inference risks, and report on some experiments by our research group to better understand the power of inference attacks in more realistic settings, and explore some broader the connections between privacy, fair-ness, and adversarial robustness.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy

自引率

0.00%

发文量

期刊最新文献

Quantum Obfuscation: Quantum Predicates with Entangled qubits When Models Learn Too Much Adaptive Fingerprinting: Website Fingerprinting over Few Encrypted Traffic Brittle Features of Device Authentication Session details: Session 2: Blockchains, Digital Currency