Efficient Data Labeling and Optimal Device Scheduling in HWNs Using Clustered Federated Semi-Supervised Learning
Authors: Moqbel Hamood; Abdullatif Albaseer; Mohamed Abdallah; Ala Al-Fuqaha
Journal: IEEE Transactions on Communications, vol. 73, no. 7, pp. 4941-4957 (impact factor 8.3; JCR Q1, Engineering, Electrical & Electronic)
Published: 2024-12-18
DOI: 10.1109/TCOMM.2024.3519538
URL: https://ieeexplore.ieee.org/document/10806860/
Clustered Federated Multi-task Learning (CFL) has emerged as a promising technique for addressing statistical challenges, particularly non-independent and identically distributed (non-IID) data across users. However, existing CFL studies rely entirely on the impractical assumption that devices have access to accurate ground-truth labels. This assumption is especially problematic in hierarchical wireless networks (HWNs), which combine vast amounts of unlabeled data with dual-level model aggregation: it slows convergence, extends processing times, and increases resource consumption. To this end, we propose Clustered Federated Semi-Supervised Learning (CFSL), a novel framework tailored to more realistic scenarios in HWNs. We leverage the specialized models that result from device clustering and present two prediction model schemes, the best-performing specialized model and the weighted-averaging ensemble model, to correctly label unlabeled, unseen data. In the best-performing specialized model scheme, the specialized model that excels at label prediction for a specific device is assigned to label the unlabeled data, even when the data originates from other environments. The weighted-averaging ensemble model instead combines all specialized models into a unified model, capturing more details from broader data distributions across edge networks. CFSL also introduces two novel prediction time schemes, split-based and stopping-based, for accurately timing the labeling process, alongside two strategic device selection schemes, greedy and round-robin, applied once each cluster reaches its stopping point. Extensive testing validates CFSL’s superiority over existing models in labeling accuracy, testing accuracy, and resource efficiency, achieving up to 51% energy savings.
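The two prediction model schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how models are scored, what the ensemble weights are, or whether averaging happens over parameters or predictions. The code below therefore assumes that each specialized model is scored by its accuracy on a small labeled set held by the device, and that the ensemble averages predicted class probabilities using those accuracies as weights. All names (`Model`, `best_specialized`, `weighted_ensemble`, `pseudo_label`, the toy models) are hypothetical and not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a "specialized model" is any function mapping a feature vector
# to a list of class probabilities. This interface is illustrative only.
Model = Callable[[Sequence[float]], List[float]]
Labeled = Sequence[Tuple[Sequence[float], int]]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(model: Model, labeled: Labeled) -> float:
    """Fraction of the device's labeled samples that the model gets right."""
    hits = sum(1 for x, y in labeled if argmax(model(x)) == y)
    return hits / len(labeled)


def best_specialized(models: Sequence[Model], labeled: Labeled) -> Model:
    """Scheme 1 (sketch): pick the cluster model that scores best on this device."""
    return max(models, key=lambda m: accuracy(m, labeled))


def weighted_ensemble(models: Sequence[Model], labeled: Labeled) -> Model:
    """Scheme 2 (sketch): average predictions, weighted by per-device accuracy."""
    weights = [accuracy(m, labeled) for m in models]
    total = sum(weights) or 1.0  # guard against all-zero weights

    def predict(x: Sequence[float]) -> List[float]:
        outputs = [m(x) for m in models]
        n_classes = len(outputs[0])
        return [
            sum(w * out[c] for w, out in zip(weights, outputs)) / total
            for c in range(n_classes)
        ]

    return predict


def pseudo_label(model: Model, unlabeled: Sequence[Sequence[float]]) -> List[int]:
    """Assign each unlabeled sample the class with the highest probability."""
    return [argmax(model(x)) for x in unlabeled]


# Toy stand-ins for two cluster-specific models (thresholds on one feature).
def model_a(x: Sequence[float]) -> List[float]:
    return [1.0, 0.0] if x[0] < 0.5 else [0.0, 1.0]


def model_b(x: Sequence[float]) -> List[float]:
    return [1.0, 0.0] if x[0] < 2.0 else [0.0, 1.0]


models = [model_a, model_b]
labeled = [([0.1], 0), ([0.9], 1), ([0.4], 0), ([1.5], 1)]
unlabeled = [[0.2], [0.9], [3.0]]

best_labels = pseudo_label(best_specialized(models, labeled), unlabeled)
ensemble_labels = pseudo_label(weighted_ensemble(models, labeled), unlabeled)
print(best_labels, ensemble_labels)
```

In this toy setup `model_a` matches the device's labeled data exactly while `model_b` is right only half the time, so the first scheme selects `model_a` and the second down-weights `model_b` rather than discarding it. A parameter-level weighted average, which the phrase "unified model" may also suggest, would follow the same pattern with model weights in place of predicted probabilities.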
About the journal:
The IEEE Transactions on Communications is dedicated to publishing high-quality manuscripts that showcase advancements in the state-of-the-art of telecommunications. Our scope encompasses all aspects of telecommunications, including telephone, telegraphy, facsimile, and television, facilitated by electromagnetic propagation methods such as radio, wire, aerial, underground, coaxial, and submarine cables, as well as waveguides, communication satellites, and lasers. We cover telecommunications in various settings, including marine, aeronautical, space, and fixed station services, addressing topics such as repeaters, radio relaying, signal storage, regeneration, error detection and correction, multiplexing, carrier techniques, communication switching systems, data communications, and communication theory. Join us in advancing the field of telecommunications through groundbreaking research and innovation.