FedDSHAR: A dual-strategy federated learning approach for human activity recognition amid noise label user
Ziqian Lin, Xuefeng Jiang, Kun Zhang, Chongjun Fan, Yaya Liu
Future Generation Computer Systems-The International Journal of Escience, vol. 166, Article 107724. Published 2025-01-25. DOI: 10.1016/j.future.2025.107724
https://www.sciencedirect.com/science/article/pii/S0167739X25000196
Abstract
Federated learning (FL) has recently achieved success in privacy-sensitive healthcare applications such as medical analysis. Most previous studies assume that the collected user data are well annotated; in practice, this is a strong assumption. For instance, the human activity recognition (HAR) task aims to train a model that predicts a person's activity from sensor data series collected over a given period of time. Because annotation approaches are diverse and incomplete, user-side data inevitably contain significant label noise, which greatly degrades model convergence and performance. In this work, we propose a novel FL framework, FedDSHAR, which partitions the user-side data into a clean subset and a noisy subset. A different strategy is applied to each subset to extract additional useful information from the data: strategic time-series augmentation is adopted on the clean subset, and a semi-supervised learning scheme is used for the noisy subset. Extensive experiments on three public real-world HAR datasets demonstrate that FedDSHAR outperforms six state-of-the-art methods, particularly under extreme label noise in real-world distributed noisy HAR scenarios. Our code is available at https://github.com/coke2020ice/FedDSHAR.
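The dual-strategy idea in the abstract can be illustrated with a minimal client-side sketch. The abstract does not say how the clean/noisy partition, the augmentation, or the semi-supervised step are implemented, so everything below is an assumption chosen for illustration rather than the FedDSHAR method itself: a median small-loss split, Gaussian jitter as the time-series augmentation, and confidence-thresholded pseudo-labeling as the semi-supervised scheme. The function names `split_by_loss`, `jitter`, and `pseudo_label` are hypothetical.

```python
import math
import random
import statistics

def split_by_loss(probs, labels):
    """Assumed small-loss heuristic (not taken from the paper): a sample is
    treated as clean when its cross-entropy loss is at or below the median."""
    losses = [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels)]
    cutoff = statistics.median(losses)
    clean = [i for i, loss in enumerate(losses) if loss <= cutoff]
    noisy = [i for i, loss in enumerate(losses) if loss > cutoff]
    return clean, noisy

def jitter(window, sigma=0.05, rng=None):
    """Assumed augmentation: add Gaussian noise to one sensor window."""
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, sigma) for x in window]

def pseudo_label(probs, threshold=0.9):
    """Assumed semi-supervised step: keep only confident model predictions,
    returned as {position in probs: predicted class}."""
    out = {}
    for i, p in enumerate(probs):
        best = max(range(len(p)), key=p.__getitem__)
        if p[best] >= threshold:
            out[i] = best
    return out

# Toy usage: four samples, two classes, all annotated as class 0.
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.95, 0.05]]
labels = [0, 0, 0, 0]
clean_idx, noisy_idx = split_by_loss(probs, labels)
augmented = jitter([0.1, 0.2, 0.3], rng=random.Random(0))
relabeled = pseudo_label([probs[i] for i in noisy_idx], threshold=0.75)
```

In this sketch the clean indices would feed augmented windows into ordinary supervised training, while the noisy indices would contribute only through their pseudo-labels. The authors' repository linked above is the place to check what the actual method does.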
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.