{"title":"确保多源网域适应性,满足全球和网域隐私需求","authors":"Shuwen Chai;Yutang Xiao;Feng Liu;Jian Zhu;Yuan Zhou","doi":"10.1109/TKDE.2024.3459890","DOIUrl":null,"url":null,"abstract":"Making available a large size of training data for deep learning models and preserving data privacy are two ever-growing concerns in the machine learning community. \n<italic>Multi-source domain adaptation</i>\n (MDA) leverages the data information from different domains and aggregates them to improve the performance in the target task, while the privacy leakage risk of publishing models under malicious attacker for membership or attribute inference is even more complicated than the one faced by single-source domain adaptation. In this paper, we tackle the problem of effectively protecting data privacy while training and aggregating multi-source information, where each source domain enjoys an independent privacy budget. Specifically, we develop a \n<italic>differentially private MDA</i>\n (DPMDA) algorithm to provide domain-wise privacy protection with adaptive weighting scheme based on task similarity and task-specific privacy budget. We evaluate our algorithm on three benchmark tasks and show that DPMDA can effectively leverage different private budgets from source domains and consistently outperforms the existing private baselines with a reasonable gap with non-private state-of-the-art.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"36 12","pages":"9235-9248"},"PeriodicalIF":8.9000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Securing Multi-Source Domain Adaptation With Global and Domain-Wise Privacy Demands\",\"authors\":\"Shuwen Chai;Yutang Xiao;Feng Liu;Jian Zhu;Yuan Zhou\",\"doi\":\"10.1109/TKDE.2024.3459890\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Making available a large size of training data for deep learning models and preserving data privacy are two ever-growing concerns in the machine learning community. \\n<italic>Multi-source domain adaptation</i>\\n (MDA) leverages the data information from different domains and aggregates them to improve the performance in the target task, while the privacy leakage risk of publishing models under malicious attacker for membership or attribute inference is even more complicated than the one faced by single-source domain adaptation. In this paper, we tackle the problem of effectively protecting data privacy while training and aggregating multi-source information, where each source domain enjoys an independent privacy budget. Specifically, we develop a \\n<italic>differentially private MDA</i>\\n (DPMDA) algorithm to provide domain-wise privacy protection with adaptive weighting scheme based on task similarity and task-specific privacy budget. 
We evaluate our algorithm on three benchmark tasks and show that DPMDA can effectively leverage different private budgets from source domains and consistently outperforms the existing private baselines with a reasonable gap with non-private state-of-the-art.\",\"PeriodicalId\":13496,\"journal\":{\"name\":\"IEEE Transactions on Knowledge and Data Engineering\",\"volume\":\"36 12\",\"pages\":\"9235-9248\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2024-09-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Knowledge and Data Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10679602/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Knowledge and Data Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10679602/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Securing Multi-Source Domain Adaptation With Global and Domain-Wise Privacy Demands
Making large amounts of training data available to deep learning models while preserving data privacy are two ever-growing concerns in the machine learning community. Multi-source domain adaptation (MDA) leverages data from different domains and aggregates them to improve performance on the target task, yet the privacy-leakage risk of publishing models exposed to malicious attackers performing membership or attribute inference is even more complicated than that faced by single-source domain adaptation. In this paper, we tackle the problem of effectively protecting data privacy while training on and aggregating multi-source information, where each source domain enjoys an independent privacy budget. Specifically, we develop a differentially private MDA (DPMDA) algorithm that provides domain-wise privacy protection through an adaptive weighting scheme based on task similarity and task-specific privacy budgets. We evaluate our algorithm on three benchmark tasks and show that DPMDA can effectively leverage the different privacy budgets of the source domains, consistently outperforming existing private baselines while keeping a reasonable gap to the non-private state of the art.
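The abstract only describes DPMDA at a high level. The minimal Python sketch below illustrates the general idea it states: each source domain perturbs its contribution with Gaussian noise calibrated to its own (epsilon, delta) budget, and the domains are then combined with weights driven by task similarity and per-domain budget. The function names (gaussian_sigma, private_domain_gradient, adaptive_weights, aggregate), the DP-SGD-style clipping, and the similarity-times-budget weighting are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def gaussian_sigma(epsilon, delta, sensitivity):
    # Standard Gaussian-mechanism heuristic: noise scale grows with sensitivity
    # and shrinks as the per-domain budget epsilon grows.
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def private_domain_gradient(grads, epsilon, delta, clip_norm=1.0):
    """Clip per-example gradients and add Gaussian noise calibrated to the
    domain's own (epsilon, delta) budget (DP-SGD-style sketch)."""
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12)) for g in grads]
    mean_grad = np.mean(clipped, axis=0)
    # Rough sensitivity of the clipped-gradient average; a real analysis
    # would use a tighter accountant.
    sigma = gaussian_sigma(epsilon, delta, sensitivity=clip_norm / len(grads))
    return mean_grad + np.random.normal(0.0, sigma, size=mean_grad.shape)

def adaptive_weights(similarities, epsilons, temperature=1.0):
    """Hypothetical weighting: favor domains that are more similar to the
    target task and that contribute a larger budget (hence less noise)."""
    scores = np.asarray(similarities, dtype=float) * np.asarray(epsilons, dtype=float)
    scores = np.exp(scores / temperature)
    return scores / scores.sum()

def aggregate(domain_grads, similarities, epsilons, delta=1e-5):
    """Combine noisy per-domain gradients into one update for the target model."""
    noisy = [private_domain_gradient(g, eps, delta)
             for g, eps in zip(domain_grads, epsilons)]
    w = adaptive_weights(similarities, epsilons)
    return sum(wi * gi for wi, gi in zip(w, noisy))
```

The similarity-times-budget score is just one plausible way to realize "adaptive weighting based on task similarity and task-specific privacy budget"; the paper's actual weighting rule and privacy accounting may differ.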
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.