FedDSHAR: A dual-strategy federated learning approach for human activity recognition amid noise label user

Impact factor: 6.2 | Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, THEORY & METHODS | Future Generation Computer Systems-The International Journal of Escience | Pub Date: 2025-01-25 | DOI: 10.1016/j.future.2025.107724

Ziqian Lin, Xuefeng Jiang, Kun Zhang, Chongjun Fan, Yaya Liu

Future Generation Computer Systems-The International Journal of Escience, volume 166, Article 107724. Article page: https://www.sciencedirect.com/science/article/pii/S0167739X25000196

Citations: 0

Abstract

Federated learning (FL) has recently achieved success in privacy-sensitive healthcare applications such as medical analysis. Most previous studies assume that the collected user data are well annotated, which is a strong assumption in practice. For instance, the human activity recognition (HAR) task aims to train a model that predicts a person's activity from sensor data series collected over a given period of time. Because annotation approaches are diverse and incomplete, user-side data inevitably contain significant label noise, which greatly degrades model convergence and performance. In this work, we propose FedDSHAR, a novel FL framework that partitions the user-side data into a clean subset and a noisy subset. A different strategy is applied to each subset to exploit additional useful information from the data: strategic time-series augmentation on the clean subset and a semi-supervised learning scheme on the noisy subset. Extensive experiments on three public real-world HAR datasets demonstrate that FedDSHAR outperforms six state-of-the-art methods, particularly under extreme label noise in real-world distributed HAR scenarios. Our code is available at https://github.com/coke2020ice/FedDSHAR.
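The abstract does not state how the clean/noisy partition is made, which augmentations are applied, or which semi-supervised scheme is used. As a rough illustration only, the sketch below shows what the client-side dual strategy could look like under three assumptions that are not taken from the paper: a small-loss criterion for the split, jitter-and-scale augmentation on the clean subset, and confidence-thresholded pseudo-labeling on the noisy subset. All function names, parameters, and the random stand-in data are hypothetical; the authors' repository is the reference for the actual method.

```python
import numpy as np

def split_by_loss(probs, labels, quantile=0.5):
    """Assumed small-loss criterion: samples whose cross-entropy loss is at or
    below the given quantile are treated as clean, the rest as noisy."""
    losses = -np.log(probs[np.arange(len(labels)), labels] + 1e-12)
    clean_mask = losses <= np.quantile(losses, quantile)
    return clean_mask, ~clean_mask

def jitter_and_scale(windows, rng, sigma=0.05):
    """Assumed augmentation: additive Gaussian jitter plus a random
    per-window amplitude scaling factor."""
    noise = rng.normal(0.0, sigma, size=windows.shape)
    scale = rng.normal(1.0, sigma, size=(windows.shape[0], 1, 1))
    return windows * scale + noise

def pseudo_label(probs, min_confidence=0.9):
    """Assumed semi-supervised step: replace given labels with model
    predictions and flag the predictions that are confident enough to keep."""
    confident = probs.max(axis=1) >= min_confidence
    return probs.argmax(axis=1), confident

# Hypothetical client data: 6 windows, 128 time steps, 3 sensor axes.
rng = np.random.default_rng(0)
windows = rng.normal(size=(6, 128, 3))
labels = np.array([0, 1, 2, 0, 1, 2])  # user-provided, possibly noisy
logits = rng.normal(size=(6, 3))       # stand-in for local model outputs
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

clean_mask, noisy_mask = split_by_loss(probs, labels)
augmented_clean = jitter_and_scale(windows[clean_mask], rng)
new_labels, keep = pseudo_label(probs[noisy_mask], min_confidence=0.6)
```

In a federated setting, each client would run steps like these locally on its own data and send only model updates to the server; the aggregation step is omitted here.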

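Source journal

Future Generation Computer Systems-The International Journal of Escience (Engineering & Technology: Computer Science, Theory & Methods)

CiteScore: 19.90

Self-citation rate: 2.70%

Publication volume: 376

Review time: 10.6 months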

Journal description: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.

Latest articles from this journal

Editorial Board

A self-organized MoE framework for distributed federated learning

Keyed watermarks: A fine-grained watermark generation for Apache Flink

Fast and Privacy-Preserving Spatial Keyword Authorization Query with access control

Performance and efficiency: A multi-generational benchmark of modern processors on bandwidth-bound HPC applications