Sample selection using multi-task autoencoders in federated learning with non-IID data

IF 5.4 2区 工程技术 Q1 ENGINEERING, MULTIDISCIPLINARY Engineering Science and Technology-An International Journal-Jestech Pub Date : 2025-01-01 Epub Date: 2024-12-09 DOI:10.1016/j.jestch.2024.101920
Emre Ardıç, Yakup Genç
{"title":"Sample selection using multi-task autoencoders in federated learning with non-IID data","authors":"Emre Ardıç,&nbsp;Yakup Genç","doi":"10.1016/j.jestch.2024.101920","DOIUrl":null,"url":null,"abstract":"<div><div>Federated learning is a machine learning paradigm in which multiple devices collaboratively train a model under the supervision of a central server while ensuring data privacy. However, its performance is often hindered by redundant, malicious, or abnormal samples, leading to model degradation and inefficiency. To overcome these issues, we propose novel sample selection methods for image classification, employing a multi-task autoencoder to estimate sample contributions through loss and feature analysis. Our approach incorporates unsupervised outlier detection, using one-class support vector machine (OCSVM), isolation forest (IF), and adaptive loss threshold (AT) methods managed by a central server to filter noisy samples on clients. We also propose a multi-class deep support vector data description (SVDD) loss controlled by a central server to enhance feature-based sample selection. We validate our methods on CIFAR10 and MNIST datasets across varying numbers of clients, non-IID distributions, and noise levels up to 40%. The results show significant accuracy improvements with loss-based sample selection, achieving gains of up to 7.02% on CIFAR10 with OCSVM and 1.83% on MNIST with AT. Additionally, our federated SVDD loss further improves feature-based sample selection, yielding accuracy gains of up to 0.99% on CIFAR10 with OCSVM. These results show the effectiveness of our methods in improving model accuracy across various client counts and noise conditions.</div></div>","PeriodicalId":48609,"journal":{"name":"Engineering Science and Technology-An International Journal-Jestech","volume":"61 ","pages":"Article 101920"},"PeriodicalIF":5.4000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Science and Technology-An International Journal-Jestech","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2215098624003069","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/12/9 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0

Abstract

Federated learning is a machine learning paradigm in which multiple devices collaboratively train a model under the supervision of a central server while ensuring data privacy. However, its performance is often hindered by redundant, malicious, or abnormal samples, leading to model degradation and inefficiency. To overcome these issues, we propose novel sample selection methods for image classification, employing a multi-task autoencoder to estimate sample contributions through loss and feature analysis. Our approach incorporates unsupervised outlier detection, using one-class support vector machine (OCSVM), isolation forest (IF), and adaptive loss threshold (AT) methods managed by a central server to filter noisy samples on clients. We also propose a multi-class deep support vector data description (SVDD) loss controlled by a central server to enhance feature-based sample selection. We validate our methods on CIFAR10 and MNIST datasets across varying numbers of clients, non-IID distributions, and noise levels up to 40%. The results show significant accuracy improvements with loss-based sample selection, achieving gains of up to 7.02% on CIFAR10 with OCSVM and 1.83% on MNIST with AT. Additionally, our federated SVDD loss further improves feature-based sample selection, yielding accuracy gains of up to 0.99% on CIFAR10 with OCSVM. These results show the effectiveness of our methods in improving model accuracy across various client counts and noise conditions.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在非iid数据的联邦学习中使用多任务自动编码器进行样本选择
联邦学习是一种机器学习范例,其中多个设备在中央服务器的监督下协同训练模型,同时确保数据隐私。然而,它的性能经常受到冗余、恶意或异常样本的阻碍,导致模型退化和效率低下。为了克服这些问题,我们提出了新的图像分类样本选择方法,采用多任务自动编码器通过损失和特征分析来估计样本贡献。我们的方法结合了无监督异常值检测,使用一类支持向量机(OCSVM)、隔离森林(IF)和自适应损失阈值(AT)方法,由中央服务器管理,以过滤客户端上的噪声样本。我们还提出了一种由中央服务器控制的多类深度支持向量数据描述(SVDD)损失,以增强基于特征的样本选择。我们在CIFAR10和MNIST数据集上验证了我们的方法,这些数据集跨越不同数量的客户端、非iid分布和高达40%的噪声水平。结果表明,基于损失的样本选择显著提高了准确率,OCSVM在CIFAR10上的增益高达7.02%,AT在MNIST上的增益高达1.83%。此外,我们的联合SVDD损失进一步改进了基于特征的样本选择,在使用OCSVM的CIFAR10上获得了高达0.99%的准确率增益。这些结果表明,我们的方法在提高不同客户数量和噪声条件下的模型精度方面是有效的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Engineering Science and Technology-An International Journal-Jestech
Engineering Science and Technology-An International Journal-Jestech Materials Science-Electronic, Optical and Magnetic Materials
CiteScore
11.20
自引率
3.50%
发文量
153
审稿时长
22 days
期刊介绍: Engineering Science and Technology, an International Journal (JESTECH) (formerly Technology), a peer-reviewed quarterly engineering journal, publishes both theoretical and experimental high quality papers of permanent interest, not previously published in journals, in the field of engineering and applied science which aims to promote the theory and practice of technology and engineering. In addition to peer-reviewed original research papers, the Editorial Board welcomes original research reports, state-of-the-art reviews and communications in the broadly defined field of engineering science and technology. The scope of JESTECH includes a wide spectrum of subjects including: -Electrical/Electronics and Computer Engineering (Biomedical Engineering and Instrumentation; Coding, Cryptography, and Information Protection; Communications, Networks, Mobile Computing and Distributed Systems; Compilers and Operating Systems; Computer Architecture, Parallel Processing, and Dependability; Computer Vision and Robotics; Control Theory; Electromagnetic Waves, Microwave Techniques and Antennas; Embedded Systems; Integrated Circuits, VLSI Design, Testing, and CAD; Microelectromechanical Systems; Microelectronics, and Electronic Devices and Circuits; Power, Energy and Energy Conversion Systems; Signal, Image, and Speech Processing) -Mechanical and Civil Engineering (Automotive Technologies; Biomechanics; Construction Materials; Design and Manufacturing; Dynamics and Control; Energy Generation, Utilization, Conversion, and Storage; Fluid Mechanics and Hydraulics; Heat and Mass Transfer; Micro-Nano Sciences; Renewable and Sustainable Energy Technologies; Robotics and Mechatronics; Solid Mechanics and Structure; Thermal Sciences) -Metallurgical and Materials Engineering (Advanced Materials Science; Biomaterials; Ceramic and Inorgnanic Materials; Electronic-Magnetic Materials; Energy and Environment; Materials Characterizastion; Metallurgy; Polymers and Nanocomposites)
期刊最新文献
Machine-learning-guided titanium alloy design for intrinsic grain growth resistance Numerical investigation of a hybrid serpentine–pin flow field for enhanced PEMFC performance under variable operating conditions Dynamic data-model fusion modeling technology for monitoring, prediction, and optimization in machining: A comprehensive review Numerical simulation driven Cycle-GAN for domain generalization-based fault diagnosis of axial piston pumps Design of a tri-band single-loop load-modulated Doherty power amplifier
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1