{"title":"Cost-Efficient Feature Selection for Horizontal Federated Learning","authors":"Sourasekhar Banerjee;Devvjiit Bhuyan;Erik Elmroth;Monowar Bhuyan","doi":"10.1109/TAI.2024.3436664","DOIUrl":null,"url":null,"abstract":"Horizontal federated learning (HFL) exhibits substantial similarities in feature space across distinct clients. However, not all features contribute significantly to the training of the global model. Moreover, the curse of dimensionality delays the training. Therefore, reducing irrelevant and redundant features from the feature space makes training faster and inexpensive. This work aims to identify the common feature subset from the clients in federated settings. We introduce a hybrid approach called Fed-MOFS,\n<xref><sup>1</sup></xref>\n<fn><label><sup>1</sup></label><p>This manuscript is an extension of Banerjee et al. <xref>[1]</xref>.</p></fn>\n utilizing mutual information (MI) and clustering for local FS at each client. Unlike the Fed-FiS, which uses a scoring function for global feature ranking, Fed-MOFS employs multiobjective optimization to prioritize features based on their higher relevance and lower redundancy. This article compares the performance of Fed-MOFS\n<xref><sup>2</sup></xref>\n<fn><label><sup>2</sup></label><p>We share our code, data, and supplementary copy through <uri>https://github.com/DevBhuyan/Horz-FL/blob/main/README.md</uri>.</p></fn>\n with conventional and federated FS methods. Moreover, we tested the scalability, stability, and efficacy of both Fed-FiS and Fed-MOFS across diverse datasets. We also assessed how FS influenced model convergence and explored its impact in scenarios with data heterogeneity. Our results show that Fed-MOFS enhances global model performance with a 50% reduction in feature space and is at least twice as fast as the FSHFL method. The computational complexity for both approaches is O(\n<inline-formula><tex-math>$d^{2}$</tex-math></inline-formula>\n), which is lower than the state of the art.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6551-6565"},"PeriodicalIF":0.0000,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10620005/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Horizontal federated learning (HFL) exhibits substantial similarity in feature space across distinct clients. However, not all features contribute significantly to training the global model, and the curse of dimensionality slows training. Removing irrelevant and redundant features from the feature space therefore makes training faster and less expensive. This work aims to identify a common feature subset across the clients in federated settings. We introduce a hybrid approach called Fed-MOFS¹ that uses mutual information (MI) and clustering for local feature selection (FS) at each client. Unlike Fed-FiS, which uses a scoring function for global feature ranking, Fed-MOFS employs multiobjective optimization to prioritize features by higher relevance and lower redundancy. This article compares the performance of Fed-MOFS² with conventional and federated FS methods. Moreover, we tested the scalability, stability, and efficacy of both Fed-FiS and Fed-MOFS across diverse datasets. We also assessed how FS influences model convergence and explored its impact in scenarios with data heterogeneity. Our results show that Fed-MOFS enhances global model performance with a 50% reduction in feature space and is at least twice as fast as the FSHFL method. The computational complexity of both approaches is $O(d^{2})$, which is lower than that of the state of the art.

¹ This manuscript is an extension of Banerjee et al. [1].
² We share our code, data, and a supplementary copy at https://github.com/DevBhuyan/Horz-FL/blob/main/README.md.
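To make the local FS step concrete, below is a minimal sketch of MI-plus-clustering feature selection at a single client, assuming scikit-learn; the function name and the "keep the highest-MI cluster" rule are illustrative assumptions, since the abstract does not specify the exact algorithm.

```python
# Hypothetical sketch of the local FS step (MI + clustering) described in the
# abstract; the actual Fed-MOFS local procedure may differ in its details.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import mutual_info_classif

def local_feature_selection(X, y, n_clusters=2, random_state=0):
    """Score each feature by its MI with the label, cluster the scores,
    and keep the features in the highest-scoring cluster."""
    mi = mutual_info_classif(X, y, random_state=random_state)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = km.fit_predict(mi.reshape(-1, 1))
    best = np.argmax(km.cluster_centers_.ravel())  # cluster with highest mean MI
    return np.flatnonzero(labels == best), mi
```

Each client would run this on its own data and send only the selected feature indices (and, optionally, the MI scores) to the server.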
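For the global step, the abstract states only that features are prioritized by two objectives, higher relevance and lower redundancy. One standard way to realize such a multiobjective ranking is non-dominated (Pareto) sorting; whether Fed-MOFS uses exactly this scheme is an assumption, so the sketch below is illustrative only.

```python
# Illustrative multiobjective feature ranking via non-dominated sorting:
# maximize relevance, minimize redundancy. This is one plausible realization
# of the two objectives named in the abstract, not the authors' exact method.
import numpy as np

def pareto_rank(relevance, redundancy):
    """Assign each feature the index of its Pareto front (0 = best).
    Feature j dominates i if it has >= relevance and <= redundancy,
    with at least one strict inequality."""
    d = len(relevance)
    ranks = np.full(d, -1)
    remaining = set(range(d))
    front = 0
    while remaining:
        nondominated = {
            i for i in remaining
            if not any(
                relevance[j] >= relevance[i] and redundancy[j] <= redundancy[i]
                and (relevance[j] > relevance[i] or redundancy[j] < redundancy[i])
                for j in remaining if j != i
            )
        }
        for i in nondominated:
            ranks[i] = front
        remaining -= nondominated
        front += 1
    return ranks  # sort features by rank (ascending) for the global ordering
```

Sorting features by their front index yields the global ranking, from which the server can keep, e.g., the top 50% of features, matching the feature-space reduction reported in the abstract.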