Authors: Lu Yu, Wenjing Nie, Lun Xin, M. Guo
Venue: Proceedings of the 3rd International Conference on Advanced Information Science and System
Published: 2021-11-26
DOI: 10.1145/3503047.3503102 (https://doi.org/10.1145/3503047.3503102)
Clustered Federated Learning Based on Data Distribution
Federated learning is a distributed machine learning framework in which many clients (e.g., mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g., a service provider) while the training data stays decentralized. Data that is not independent and identically distributed (non-IID) across clients is one of the challenges in federated learning applications, because it reduces both model accuracy and modeling efficiency. We present a clustered federated learning algorithm based on data distribution and evaluate it empirically. To protect the privacy of each client's data, we apply an encrypted distance computation algorithm when measuring similarity between data sets. The experiments show that the approach improves both the accuracy and the efficiency of federated learning: the AUC of the clustered model is about 15% higher than that of the conventional model, while the time cost of clustered modeling is less than half that of conventional modeling.
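
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: each client is summarized by per-feature means and standard deviations, clients whose summaries are close are grouped together, and a model is averaged within each group. The summary statistic, the threshold-based grouping, the function names, and the plaintext Euclidean distance (standing in for the encrypted distance computation) are all illustrative choices, not the authors' method.

```python
import numpy as np

def client_summary(X):
    """Hypothetical distribution summary: per-feature mean and std of one client's data."""
    return np.concatenate([X.mean(axis=0), X.std(axis=0)])

def pairwise_distances(summaries):
    """Plaintext Euclidean distances between client summaries.

    Assumption: a real deployment would compute these under encryption;
    that step is not reproduced here.
    """
    S = np.asarray(summaries, dtype=float)
    diff = S[:, None, :] - S[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def cluster_clients(dist, threshold):
    """Group clients into connected components of the graph that links
    any two clients whose distance is below `threshold`."""
    n = dist.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and dist[u, v] < threshold:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels

def fedavg(weights, sizes):
    """Sample-size-weighted average of client weight vectors (one row per client)."""
    W = np.asarray(weights, dtype=float)
    s = np.asarray(sizes, dtype=float)
    return (W * s[:, None]).sum(axis=0) / s.sum()

# Synthetic demo: two clients drawn around 0 and two drawn around 5.
rng = np.random.default_rng(0)
clients = [
    rng.normal(0.0, 1.0, size=(200, 3)),
    rng.normal(0.0, 1.0, size=(200, 3)),
    rng.normal(5.0, 1.0, size=(200, 3)),
    rng.normal(5.0, 1.0, size=(200, 3)),
]
dist = pairwise_distances([client_summary(X) for X in clients])
labels = cluster_clients(dist, threshold=2.0)  # threshold is an arbitrary demo value

# Aggregate a (made-up) weight vector per client within each cluster.
fake_weights = [X.mean(axis=0) for X in clients]
cluster_models = {
    c: fedavg(
        [w for w, l in zip(fake_weights, labels) if l == c],
        [len(X) for X, l in zip(clients, labels) if l == c],
    )
    for c in set(labels)
}
```

In this sketch, clients with similar summaries end up sharing a cluster model, so each aggregated model is fit to a more homogeneous slice of the data than a single global average would be.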