{"title":"Byzantine Fault-Tolerant Parallelized Stochastic Gradient Descent for Linear Regression","authors":"Nirupam Gupta, N. Vaidya","doi":"10.1109/ALLERTON.2019.8919735","DOIUrl":null,"url":null,"abstract":"This paper addresses the problem of Byzantine fault-tolerance in parallelized stochastic gradient descent (SGD) method solving for a linear regression problem. We consider a synchronous system comprising of a master and multiple workers, where up to a (known) constant number of workers are Byzantine faulty. Byzantine faulty workers may send incorrect information to the master during an execution of the parallelized SGD method. To mitigate the detrimental impact of Byzantine faulty workers, we replace the averaging of gradients in the traditional parallelized SGD method by a provably more robust gradient aggregation rule. The crux of the proposed gradient aggregation rule is a gradient-filter, named comparative gradient clipping(CGC) filter. We show that the resultant parallelized SGD method obtains a good estimate of the regression parameter even in presence of bounded fraction of Byzantine faulty workers. The upper bound derived for the asymptotic estimation error only grows linearly with the fraction of Byzantine faulty workers.","PeriodicalId":120479,"journal":{"name":"2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton)","volume":"135 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ALLERTON.2019.8919735","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16
Abstract
This paper addresses the problem of Byzantine fault-tolerance in the parallelized stochastic gradient descent (SGD) method applied to a linear regression problem. We consider a synchronous system comprising a master and multiple workers, where up to a (known) constant number of workers are Byzantine faulty. Byzantine faulty workers may send incorrect information to the master during the execution of the parallelized SGD method. To mitigate the detrimental impact of Byzantine faulty workers, we replace the averaging of gradients in the traditional parallelized SGD method with a provably more robust gradient aggregation rule. The crux of the proposed gradient aggregation rule is a gradient-filter, named the comparative gradient clipping (CGC) filter. We show that the resultant parallelized SGD method obtains a good estimate of the regression parameter even in the presence of a bounded fraction of Byzantine faulty workers. The upper bound derived for the asymptotic estimation error grows only linearly with the fraction of Byzantine faulty workers.
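To make the aggregation step concrete, below is a minimal sketch of a master loop using a CGC-style gradient filter. It assumes a clipping rule in which, given at most f Byzantine workers, the f largest gradient norms are scaled down to the (f+1)-th largest norm before averaging; this rule, along with the helper names cgc_aggregate and master_sgd and all hyperparameters, are illustrative assumptions based on the abstract, not the paper's verbatim algorithm.

```python
import numpy as np

def cgc_aggregate(gradients, f):
    """Aggregate worker gradients with a CGC-style filter (assumed rule:
    clip the f largest gradient norms to the (f+1)-th largest norm,
    then average all gradients).

    gradients: list of np.ndarray, one per worker (some may be Byzantine).
    f: known upper bound on the number of Byzantine faulty workers.
    """
    norms = np.array([np.linalg.norm(g) for g in gradients])
    # The (f+1)-th largest norm serves as the clipping threshold.
    threshold = np.sort(norms)[::-1][f]
    clipped = [
        g * (threshold / n) if n > threshold and n > 0 else g
        for g, n in zip(gradients, norms)
    ]
    return np.mean(clipped, axis=0)

def master_sgd(X_parts, y_parts, f, steps=1000, lr=0.01):
    """Synchronous master loop: each worker computes a local linear
    regression gradient; the master aggregates with the CGC filter."""
    w = np.zeros(X_parts[0].shape[1])
    for _ in range(steps):
        # Honest workers send the gradient of 0.5*||Xw - y||^2 on their
        # local data; Byzantine workers could send arbitrary vectors.
        grads = [X.T @ (X @ w - y) / len(y)
                 for X, y in zip(X_parts, y_parts)]
        w -= lr * cgc_aggregate(grads, f)
    return w
```

Under this assumed rule, a Byzantine worker cannot inflate the aggregate arbitrarily: its contribution is capped at the norm of some honest worker's gradient, which is consistent with the abstract's claim that the estimation error bound grows only linearly with the fraction of faulty workers.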