Ganzhao Yuan, Zhenjie Zhang, M. Winslett, Xiaokui Xiao, Y. Yang, Z. Hao
{"title":"精确和近似差分隐私下的批量线性查询优化","authors":"Ganzhao Yuan, Zhenjie Zhang, M. Winslett, Xiaokui Xiao, Y. Yang, Z. Hao","doi":"10.1145/2699501","DOIUrl":null,"url":null,"abstract":"Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results while satisfying the privacy guarantees. Previous work, notably Li et al. [2010], has suggested that, with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose a low-rank mechanism (LRM), the first practical differentially private technique for answering batch linear queries with high accuracy. LRM works for both exact (i.e., ε-) and approximate (i.e., (ε, Δ)-) differential privacy definitions. We derive the utility guarantees of LRM and provide guidance on how to set the privacy parameters, given the user's utility expectation. Extensive experiments using real data demonstrate that our proposed method consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.","PeriodicalId":50915,"journal":{"name":"ACM Transactions on Database Systems","volume":"3 1","pages":"11:1-11:47"},"PeriodicalIF":2.2000,"publicationDate":"2015-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"30","resultStr":"{\"title\":\"Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy\",\"authors\":\"Ganzhao Yuan, Zhenjie Zhang, M. Winslett, Xiaokui Xiao, Y. Yang, Z. Hao\",\"doi\":\"10.1145/2699501\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results while satisfying the privacy guarantees. Previous work, notably Li et al. [2010], has suggested that, with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose a low-rank mechanism (LRM), the first practical differentially private technique for answering batch linear queries with high accuracy. LRM works for both exact (i.e., ε-) and approximate (i.e., (ε, Δ)-) differential privacy definitions. We derive the utility guarantees of LRM and provide guidance on how to set the privacy parameters, given the user's utility expectation. Extensive experiments using real data demonstrate that our proposed method consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.\",\"PeriodicalId\":50915,\"journal\":{\"name\":\"ACM Transactions on Database Systems\",\"volume\":\"3 1\",\"pages\":\"11:1-11:47\"},\"PeriodicalIF\":2.2000,\"publicationDate\":\"2015-02-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"30\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Database Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/2699501\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Database Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/2699501","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy
Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results while satisfying the privacy guarantees. Previous work, notably Li et al. [2010], has suggested that, with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose a low-rank mechanism (LRM), the first practical differentially private technique for answering batch linear queries with high accuracy. LRM works for both exact (i.e., ε-) and approximate (i.e., (ε, Δ)-) differential privacy definitions. We derive the utility guarantees of LRM and provide guidance on how to set the privacy parameters, given the user's utility expectation. Extensive experiments using real data demonstrate that our proposed method consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.
期刊介绍:
Heavily used in both academic and corporate R&D settings, ACM Transactions on Database Systems (TODS) is a key publication for computer scientists working in data abstraction, data modeling, and designing data management systems. Topics include storage and retrieval, transaction management, distributed and federated databases, semantics of data, intelligent databases, and operations and algorithms relating to these areas. In this rapidly changing field, TODS provides insights into the thoughts of the best minds in database R&D.