VeriTrac: Verifiable and traceable cross-silo federated learning
Yanxin Xu, Hua Zhang, Zhenyan Liu, Fei Gao, Lei Qiao
Future Generation Computer Systems, vol. 168, Article 107780. Published 2025-02-27.
DOI: 10.1016/j.future.2025.107780
URL: https://www.sciencedirect.com/science/article/pii/S0167739X25000755
Citations: 0
Abstract
Cross-silo federated learning enables many clients to train a machine learning model collaboratively while keeping their raw training data local. It faces the risks of privacy leakage and malicious participants. In this paper, we introduce a new security risk: malicious clients may disrupt the training process of cross-silo federated learning by falsifying the verification evidence. A verification failure caused by this behavior is not easily distinguishable from one caused by a malicious server falsifying the aggregated model. To address this issue, we design VeriTrac, the first privacy-preserving cross-silo federated learning scheme that supports both verifiability and traceability. Before performing the aggregation, the server can use the clients' non-private information to verify the messages they submit, which protects it from being framed. When the proportion of malicious clients is less than 50%, the malicious participants that cause a verification error can be traced. In addition, to verify the correctness of the aggregated models, a model vector with a verification factor is constructed and encrypted. The vector is kept confidential from the server, while the factor is part of the verification evidence and can be recovered by clients. Security analysis shows that VeriTrac can guarantee the tracing of malicious participants and the data security of clients. Experimental evaluation shows that the computation and communication efficiency of VeriTrac are acceptable.
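The abstract does not give VeriTrac's actual construction, so the following is only a toy sketch of the general idea of attaching a verification factor to a model vector so that clients can check an aggregated result. It is not the authors' scheme and it omits encryption entirely. Everything in it is an assumption made for illustration: the prime modulus `P`, the client-shared secret `key` (unknown to the server), and the helper names `make_key`, `extend`, `aggregate`, and `verify`.

```python
import secrets

# Toy parameters (assumptions, not taken from the paper).
P = 2**61 - 1  # a Mersenne prime used as the modulus


def make_key(dim):
    """Secret shared by clients only: one nonzero coefficient per coordinate."""
    return [secrets.randbelow(P - 1) + 1 for _ in range(dim)]


def extend(model, key):
    """Append a linear check value (the toy 'verification factor') to a model."""
    tag = sum(k * w for k, w in zip(key, model)) % P
    return [w % P for w in model] + [tag]


def aggregate(extended_vectors):
    """Server side: coordinate-wise sum, including the appended check values."""
    return [sum(column) % P for column in zip(*extended_vectors)]


def verify(aggregated, key):
    """Client side: recompute the check value from the aggregated model."""
    *model_sum, tag = aggregated
    expected = sum(k * w for k, w in zip(key, model_sum)) % P
    return tag == expected


key = make_key(3)
clients = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical integer-encoded models

honest = aggregate([extend(m, key) for m in clients])

# A server that alters one coordinate of the aggregate breaks the check.
tampered = honest.copy()
tampered[0] = (tampered[0] + 1) % P

honest_ok = verify(honest, key)
tampered_ok = verify(tampered, key)
```

Because the check value is linear in the model, the sum of the clients' check values equals the check value of the summed model, which is why an honest aggregate passes. The sketch says nothing about who is at fault when the check fails, which is the traceability problem the paper addresses.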
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.