Bram van Berlo, Aaqib Saeed, T. Ozcelebi
Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking
Published: 2020-04-27. DOI: 10.1145/3378679.3394530
Towards federated unsupervised representation learning
Making deep learning models effective at inference today requires training on large amounts of labeled data gathered in a centralized system. However, gathering labeled data is an expensive and time-consuming process, centralized systems cannot aggregate an ever-increasing volume of data, and aggregating user data raises privacy concerns. Federated learning solves the data-volume and privacy issues by leaving user data on devices, but it is limited to use cases where labeled data can be generated from user interaction. Unsupervised representation learning reduces the amount of labeled data required for model training, but previous work is limited to centralized systems. This work introduces federated unsupervised representation learning, a novel software architecture that uses unsupervised representation learning to pre-train deep neural networks on unlabeled data in a federated setting. The pre-trained networks can then be used to extract discriminative features, which help learn a downstream task of interest with a reduced amount of labeled data. Based on representation-performance experiments with human activity detection, it is recommended to pre-train on unlabeled data originating from more users performing a larger set of activities than the data used for the downstream task of interest. As a result, performance competitive with or superior to supervised deep learning is achieved.
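The pipeline the abstract describes can be sketched in miniature: clients pre-train a model on their own unlabeled data with an unsupervised reconstruction objective, a server averages the resulting weights (federated averaging), and the frozen pre-trained encoder is then reused to extract features for a downstream task. The following is a minimal illustrative sketch, not the authors' implementation; the linear tied-weight autoencoder, client count, and hyperparameters are all assumptions made for brevity.

```python
# Sketch of federated unsupervised pre-training (illustrative, not the paper's code):
# each client trains a tied-weight linear autoencoder on its local unlabeled data,
# the server averages client weights (FedAvg), and the resulting encoder is reused
# for downstream feature extraction.
import numpy as np

rng = np.random.default_rng(0)
D, H = 8, 3  # input dimension, representation dimension (assumed values)

def local_autoencoder_step(W, X, lr=0.01):
    """One local gradient step on reconstruction loss ||X W W^T - X||^2."""
    Z = X @ W          # encode: n x H
    X_hat = Z @ W.T    # decode with tied weights: n x D
    E = X_hat - X      # reconstruction error
    grad = (X.T @ E @ W + E.T @ X @ W) / len(X)
    return W - lr * grad

def fed_avg(client_weights, client_sizes):
    """Server aggregation: average client models, weighted by local data volume."""
    total = sum(client_sizes)
    return sum(n / total * W for W, n in zip(client_weights, client_sizes))

# Unlabeled data never leaves the clients; only model weights are exchanged.
clients = [rng.normal(size=(50, D)) for _ in range(4)]
W = rng.normal(scale=0.1, size=(D, H))  # global encoder weights
for _ in range(20):  # federated rounds
    local = [local_autoencoder_step(W.copy(), X) for X in clients]
    W = fed_avg(local, [len(X) for X in clients])

# Downstream: the frozen encoder extracts features from a small labeled set,
# which would then feed a task-specific classifier.
X_labeled = rng.normal(size=(10, D))
features = X_labeled @ W
print(features.shape)  # (10, 3)
```

In the paper's setting the local objective would be a deep unsupervised representation-learning task rather than this toy linear autoencoder, but the structure (local unsupervised updates, weighted server averaging, frozen-encoder feature extraction) is the same.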