Distributed Learning of Pure Non-IID data using Latent Codes

Anirudh Kasturi, A. Agrawal, C. Hota
{"title":"Distributed Learning of Pure Non-IID data using Latent Codes","authors":"Anirudh Kasturi, A. Agrawal, C. Hota","doi":"10.1109/IMCOM56909.2023.10035595","DOIUrl":null,"url":null,"abstract":"There has been a huge increase in the amount of data being generated as a result of the proliferation of high-tech, data-generating devices made possible by recent developments in mobile technology. This has rekindled interest in creating smart applications that can make use of the possibilities of this data and provide insightful results. Concerns about bandwidth, privacy, and latency arise when this data from many devices is aggregated in one location to create more precise predictions. This research presents a novel distributed learning approach, wherein a Variational Auto Encoder is trained locally on each client and then used to derive a sample set of points centrally. The server then develops a unified global model, and sends its training parameters to all users. Pure non-i.i.d. distributions, in which each client only sees data labelled with a single value, are the primary focus of our study. According to our findings, communication amongst the server and the clients takes significantly less time than it does in federated and centralised learning setups. We further demonstrate that, whenever the data is spread in a pure non-iid fashion, our methodology achieves higher accuracy than the federated learning strategy by more than 4%. 
We also showed that, in comparison to centralised and federated learning systems, our suggested method requires less network bandwidth.","PeriodicalId":230213,"journal":{"name":"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IMCOM56909.2023.10035595","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Recent advances in mobile technology have enabled a proliferation of data-generating devices and, with it, a sharp increase in the volume of data produced. This has rekindled interest in smart applications that can exploit this data to deliver insightful results. Aggregating data from many devices in one location to make more precise predictions, however, raises concerns about bandwidth, privacy, and latency. This research presents a novel distributed learning approach in which a Variational Auto Encoder is trained locally on each client and then used to derive a sample set of points centrally. The server then builds a unified global model and sends its training parameters to all clients. Pure non-IID distributions, in which each client only sees data labelled with a single value, are the primary focus of our study. Our findings show that communication between the server and the clients takes significantly less time than in federated and centralised learning setups. We further demonstrate that, when the data is distributed in a pure non-IID fashion, our method achieves more than 4% higher accuracy than the federated learning strategy. We also show that, compared to centralised and federated learning systems, the proposed method requires less network bandwidth.
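The pipeline the abstract describes — each client trains a local generative model, the server samples synthetic points from the clients' generators, pools them, fits a single global model, and broadcasts its parameters — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `ClientGenerator` below uses a per-feature Gaussian fit as a lightweight stand-in for a trained VAE decoder, and the server's "unified global model" is a simple nearest-centroid classifier. All class and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class ClientGenerator:
    """Stand-in for a locally trained VAE: fits a per-feature Gaussian to
    the client's private data and samples synthetic points from it, so the
    raw data never leaves the client."""
    def __init__(self, data, label):
        self.label = label                    # pure non-IID: one label per client
        self.mean = data.mean(axis=0)
        self.std = data.std(axis=0) + 1e-6    # avoid zero variance

    def sample(self, n, rng):
        # Decoder stand-in: draw synthetic points from the fitted Gaussian.
        return rng.normal(self.mean, self.std, size=(n, self.mean.size))

def server_round(generators, n_per_client, rng):
    """Server pools synthetic samples from every client's generator and fits
    a nearest-centroid classifier as a stand-in for the unified global model.
    The returned parameters are what would be broadcast back to all clients."""
    centroids = {}
    for g in generators:
        synthetic = g.sample(n_per_client, rng)
        centroids[g.label] = synthetic.mean(axis=0)
    return centroids

def predict(centroids, x):
    """Classify a point with the broadcast global model."""
    return min(centroids, key=lambda lbl: np.linalg.norm(x - centroids[lbl]))

# Two clients, each holding data for a single label (a pure non-IID split).
client_a = ClientGenerator(rng.normal(0.0, 0.5, size=(200, 2)), label=0)
client_b = ClientGenerator(rng.normal(3.0, 0.5, size=(200, 2)), label=1)

model = server_round([client_a, client_b], n_per_client=100, rng=rng)
print(predict(model, np.array([0.1, -0.2])))  # a point near client A's data
print(predict(model, np.array([2.9, 3.1])))   # a point near client B's data
```

Note the communication pattern this implies: each client uploads only a compact generator (or samples drawn from it) once, and downloads the global model's parameters, which is consistent with the abstract's claim of lower bandwidth than shipping raw data (centralised) or exchanging full model updates every round (federated).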