Federated Multi-View K-Means Clustering
Authors: Miin-Shen Yang; Kristina P. Sinaga
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 4, pp. 2446-2459
DOI: 10.1109/TPAMI.2024.3520708
Publication date: 2024-12-20
Publication type: Journal Article
Journal impact factor: 18.6
URL: https://ieeexplore.ieee.org/document/10810504/
Citations: 0
Abstract
The growing reach of the Internet of Things (IoT) has made massive volumes of Big Data available in many fields. In general, these data may be non-independent and identically distributed (non-IID). In this paper, we enable multi-view k-means (MVKM) clustering to preserve the privacy of each database by running MVKM locally on each client's multi-view data. The work uses an exponential distance to transform the weighted Euclidean distance in MVKM, so that MVKM clustering can take full advantage of developments in federated learning. The proposed algorithm, called federated MVKM (Fed-MVKM), adds several new ideas to MVKM with the aim of producing better clustering output. Fed-MVKM is well suited to clustering large data sets. To demonstrate its efficiency and applicability, we evaluate it on one synthetic and six real multi-view data sets, and we use the Federated Peter-Clark algorithm of Huang et al. (2023), in a causal inference setting, to split the data instances efficiently across multiple clients. The results show that sharing models based on data-driven local cluster centers in the federated environment yields a satisfactory final clustering pattern for a multi-view data set while also improving on the clustering performance of (non-federated) MVKM clustering algorithms.
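The abstract gives no formulas, so the following is only a minimal sketch of the general idea and not the authors' Fed-MVKM. It assumes the exponential distance takes the form 1 - exp(-beta * ||x - a||^2), that the view weights are fixed rather than learned, that each client updates centers with a plain per-cluster mean, and that the server merges client centers by a count-weighted average. All function names (`exp_distance`, `assign`, `local_update`, `aggregate`, `federated_fit`) and the parameter `beta` are hypothetical.

```python
import math


def exp_distance(x, center, beta):
    """Exponential transform of the squared Euclidean distance (assumed form)."""
    sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return 1.0 - math.exp(-beta * sq)


def assign(views, centers, weights, beta):
    """Label each instance with the cluster of smallest view-weighted distance.

    views[h][i]   : feature vector of instance i in view h
    centers[h][k] : center of cluster k in view h
    weights[h]    : non-negative weight of view h (fixed here, an assumption)
    """
    n_clusters = len(centers[0])
    labels = []
    for i in range(len(views[0])):
        costs = [
            sum(w * exp_distance(view[i], ctr[k], beta)
                for view, ctr, w in zip(views, centers, weights))
            for k in range(n_clusters)
        ]
        labels.append(costs.index(min(costs)))
    return labels


def local_update(views, centers, weights, beta):
    """One client step: assign locally, then recompute per-view cluster means.

    Returns (new_centers, counts); the raw instances stay on the client.
    """
    labels = assign(views, centers, weights, beta)
    n_clusters = len(centers[0])
    counts = [labels.count(k) for k in range(n_clusters)]
    new_centers = []
    for view, ctr in zip(views, centers):
        view_centers = []
        for k in range(n_clusters):
            members = [x for x, lab in zip(view, labels) if lab == k]
            if members:
                dim = len(members[0])
                view_centers.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)])
            else:
                view_centers.append(list(ctr[k]))  # empty cluster: keep old center
        new_centers.append(view_centers)
    return new_centers, counts


def aggregate(global_centers, client_results):
    """Server step: count-weighted average of the clients' centers."""
    merged = []
    for h, view_centers in enumerate(global_centers):
        merged_view = []
        for k, old in enumerate(view_centers):
            total = sum(counts[k] for _, counts in client_results)
            if total == 0:
                merged_view.append(list(old))  # no client used this cluster
                continue
            merged_view.append([
                sum(counts[k] * ctrs[h][k][d] for ctrs, counts in client_results) / total
                for d in range(len(old))
            ])
        merged.append(merged_view)
    return merged


def federated_fit(clients, init_centers, weights, beta, rounds=5):
    """Alternate local client updates and server aggregation for a few rounds."""
    centers = init_centers
    for _ in range(rounds):
        results = [local_update(views, centers, weights, beta) for views in clients]
        centers = aggregate(centers, results)
    return centers
```

In this sketch only cluster centers and per-cluster counts are exchanged, which mirrors the abstract's point that clustering runs locally on each client's multi-view data. Because every client starts each round from the same broadcast centers, cluster indices stay aligned across clients without any extra matching step. The value of `beta` has to suit the scale of the data: if it is too large, every distance saturates near 1 and the assignments stop being informative.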