FedNE: Surrogate-Assisted Federated Neighbor Embedding for Dimensionality Reduction
Ziwei Li, Xiaoqi Wang, Hong-You Chen, Han-Wei Shen, Wei-Lun Chao
arXiv - CS - Machine Learning, 2024-09-17
DOI: arxiv-2409.11509 (https://doi.org/arxiv-2409.11509)
Abstract
Federated learning (FL) has rapidly emerged as a promising paradigm that
enables collaborative model training across distributed participants without
exchanging their local data. Although FL has been applied broadly in fields
such as computer vision, graph learning, and natural language processing,
developing a data projection model that can effectively visualize data in the
FL setting is crucial yet remains heavily under-explored. Neighbor embedding
(NE) is an essential technique for
visualizing complex high-dimensional data, but collaboratively learning a joint
NE model is difficult. The key challenge lies in the objective function, as
effective visualization algorithms like NE require computing loss functions
over pairs of data points. In this paper, we introduce FedNE, a novel
approach that integrates the FedAvg framework with the contrastive NE
technique without requiring any shareable data. To address the lack of
inter-client repulsion, which is crucial for alignment in the global
embedding space, we develop a surrogate loss function that each client learns
and shares with the other clients. Additionally, we propose a data-mixing
strategy to augment the local data, aiming to mitigate the problems of invisible
neighbors and false neighbors arising from the local $k$NN graphs. We conduct
comprehensive
experiments on both synthetic and real-world datasets. The results demonstrate
that FedNE effectively preserves neighborhood data structures and improves
alignment in the global embedding space compared to several baseline methods.
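
To make the pairwise-loss difficulty concrete, here is a minimal Python sketch of the two ingredients the abstract names: a contrastive NE loss computed on one client's local data, and FedAvg-style aggregation of client parameters. It is an illustration under stated assumptions, not the paper's implementation. The linear 2-D projection, the Cauchy similarity kernel, the negative-sampling form of the loss, and all function names (`knn_pairs`, `project`, `contrastive_ne_loss`, `fedavg`) are hypothetical choices made here. The surrogate loss and the data-mixing strategy are not sketched, since the abstract does not specify them.

```python
import math
import random


def knn_pairs(points, k):
    """Return sorted (i, j) pairs where j is among the k nearest neighbors of i."""
    pairs = set()
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        pairs.update((i, j) for j in others[:k])
    return sorted(pairs)


def project(w, x):
    """Assumed model: a linear map, with w given as a list of output rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def contrastive_ne_loss(w, points, k, n_neg, rng):
    """Simplified contrastive NE loss on ONE client's data (an assumption).

    kNN pairs are attracted; randomly sampled local points are repelled.
    Every pair is drawn from `points`, so no term involves another client.
    """
    emb = [project(w, x) for x in points]

    def q(i, j):
        # Cauchy similarity between two embedded points.
        return 1.0 / (1.0 + math.dist(emb[i], emb[j]) ** 2)

    eps = 1e-9  # guards against log(0)
    loss = 0.0
    for i, j in knn_pairs(points, k):
        loss -= math.log(q(i, j) + eps)  # attraction term
        for _ in range(n_neg):
            m = rng.randrange(len(points))
            if m != i:
                loss -= math.log(1.0 - q(i, m) + eps)  # repulsion term
    return loss


def fedavg(client_weights, client_sizes):
    """Average client parameters, weighted by each client's sample count."""
    total = sum(client_sizes)
    n_rows, n_cols = len(client_weights[0]), len(client_weights[0][0])
    return [
        [
            sum(w[r][c] * n for w, n in zip(client_weights, client_sizes)) / total
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]
```

In this sketch each client would evaluate `contrastive_ne_loss` on its own points, and a server would combine the resulting weights with `fedavg`. Because both the attraction and repulsion pairs come from a single client, nothing pushes one client's points away from another client's, which is the missing inter-client repulsion the abstract refers to. Likewise, `knn_pairs` can only return neighbors held by the same client, so true neighbors stored elsewhere stay invisible.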