Communication-Efficient Federated Learning for Decision Trees
Shuo Zhao; Zikun Zhu; Xin Li; Ying-Chi Chen
IEEE Transactions on Artificial Intelligence, vol. 5, no. 11, pp. 5478-5492. Published 2024-07-25. DOI: 10.1109/TAI.2024.3433419
https://ieeexplore.ieee.org/document/10609778/
Growing concerns about data privacy and security have driven the emergence of federated learning, which preserves privacy by letting multiple clients learn collaboratively without sharing their raw data. In this article, we propose a communication-efficient federated learning algorithm for decision trees (DTs), referred to as FL-DT. The key idea is to exchange the statistics of only a small number of features between the server and all clients, enabling the optimal feature for splitting each DT node to be identified without compromising privacy. To find the splitting feature efficiently from the partial information available at each DT node, we derive a novel formulation that estimates lower and upper bounds on the Gini indexes of all features by solving a sequence of mixed-integer convex programming problems. Experimental results on various public datasets demonstrate that FL-DT substantially reduces communication overhead without any loss of classification accuracy compared with conventional methods.
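
To make the Gini-based split selection concrete, the following is a minimal sketch of the naive baseline that the abstract contrasts against: every client reports statistics for every feature, and the server aggregates them to pick the split with the lowest weighted Gini index. It is not the FL-DT algorithm itself, which exchanges statistics for only a few features and bounds the Gini indexes of the rest through mixed-integer convex programs. The abstract does not say what form the statistics take, so per-branch class-count tables are an assumption here, and all function names, feature names, and numbers are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumption: the "statistics" a client shares for one candidate split are
# class counts per branch, laid out as counts[branch][class].
Counts = List[List[int]]


def gini(class_counts: Sequence[int]) -> float:
    """Gini impurity 1 - sum_k p_k^2 of one branch, from its class counts."""
    n = sum(class_counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in class_counts)


def aggregate(client_counts: Sequence[Counts]) -> Counts:
    """Server side: element-wise sum of the count tables sent by all clients."""
    branches = len(client_counts[0])
    classes = len(client_counts[0][0])
    total = [[0] * classes for _ in range(branches)]
    for counts in client_counts:
        for b in range(branches):
            for k in range(classes):
                total[b][k] += counts[b][k]
    return total


def split_gini(counts: Counts) -> float:
    """Gini index of a split: branch impurities weighted by branch size."""
    n = sum(sum(branch) for branch in counts)
    if n == 0:
        return 0.0
    return sum(sum(branch) / n * gini(branch) for branch in counts)


def best_feature(stats: Dict[str, Sequence[Counts]]) -> str:
    """Pick the feature whose aggregated split has the lowest Gini index."""
    return min(stats, key=lambda f: split_gini(aggregate(stats[f])))


# Hypothetical example: two clients, two binary features, two classes.
stats = {
    "feature_a": [[[3, 0], [0, 2]], [[2, 0], [0, 3]]],  # separates the classes
    "feature_b": [[[2, 1], [1, 1]], [[1, 1], [1, 2]]],  # mixes the classes
}
chosen = best_feature(stats)  # "feature_a"
```

In this baseline the communication cost grows with the number of features, because each client sends one count table per feature at every node. That cost is what the abstract says FL-DT reduces by exchanging statistics for only a small number of features.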