Inductive and Effective Privacy-preserving Semi-supervised Learning with Harmonic Anchor Mixture
Authors: Zhi Li, Zhoujun Li
Published in: 2021 International Symposium on Electrical, Electronics and Information Engineering
Publication date: 2021-02-19
DOI: 10.1145/3459104.3459187 (https://doi.org/10.1145/3459104.3459187)
Citations: 1
Abstract
Distributed privacy-preserving data mining (DPPDM) has attracted considerable attention. It allows multiple participants to train a model jointly on their combined datasets while preserving data privacy. Many works have studied semi-supervised learning in DPPDM, which combines labeled and unlabeled data for better performance. However, these works provide only transductive solutions: they can predict labels for instances in the training set, but not for new samples outside it. Moreover, these methods rely on approximate calculations for security reasons, which leads to sub-optimal results and limited effectiveness. In this paper, we propose a mixture-model-based solution for inductive and effective semi-supervised learning in DPPDM. The key idea is to combine mixture models and graph-based methods to construct an anchor mixture capable of label prediction. We also propose an optimization process that is computed accurately, without approximation, through secure computation protocols to achieve effectiveness. Experiments on synthetic and real-world datasets demonstrate that our proposal outperforms state-of-the-art methods in both transductive and inductive tasks.
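To make the transductive/inductive distinction concrete, the following is a minimal sketch of how a mixture over anchors can label a sample that was never in the training set. It is not the paper's algorithm: it assumes isotropic Gaussian components with equal weights, a fixed bandwidth `sigma`, anchors whose soft labels are already known, and plaintext arithmetic with no secure computation protocol. All names and the toy data are hypothetical.

```python
import math


def responsibilities(x, anchors, sigma=1.0):
    """Soft assignment of point x to each anchor.

    Assumption: equal-weight isotropic Gaussian components centred on the anchors.
    """
    logits = [
        -sum((xi - ai) ** 2 for xi, ai in zip(x, anchor)) / (2.0 * sigma ** 2)
        for anchor in anchors
    ]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def predict_proba(x, anchors, anchor_labels, sigma=1.0):
    """Class distribution for x: anchor label distributions mixed by responsibility."""
    r = responsibilities(x, anchors, sigma)
    n_classes = len(anchor_labels[0])
    return [
        sum(r[k] * anchor_labels[k][c] for k in range(len(anchors)))
        for c in range(n_classes)
    ]


def predict(x, anchors, anchor_labels, sigma=1.0):
    """Index of the most probable class for x."""
    proba = predict_proba(x, anchors, anchor_labels, sigma)
    return max(range(len(proba)), key=proba.__getitem__)


# Hypothetical toy setup: two anchors, each carrying a soft label over two classes.
anchors = [[0.0, 0.0], [4.0, 4.0]]
anchor_labels = [[1.0, 0.0], [0.0, 1.0]]

# Neither point below is part of any training set, so this is inductive prediction.
near_first = predict([0.5, -0.2], anchors, anchor_labels)
near_second = predict([3.8, 4.1], anchors, anchor_labels)
```

Because the prediction depends only on the anchors and their labels, any new sample can be scored. A purely transductive method, by contrast, only assigns labels to the nodes of the graph built from the training set.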