{"title":"Toward Quality of Information Aware Distributed Machine Learning","authors":"Houping Xiao, Shiyu Wang","doi":"10.1145/3522591","DOIUrl":null,"url":null,"abstract":"In the era of big data, data are usually distributed across numerous connected computing and storage units (i.e., nodes or workers). Under such an environment, many machine learning problems can be reformulated as a consensus optimization problem, which consists of one objective and constraint terms splitting into N parts (each corresponds to a node). Such a problem can be solved efficiently in a distributed manner via Alternating Direction Method of Multipliers (ADMM). However, existing consensus optimization frameworks assume that every node has the same quality of information (QoI), i.e., the data from all the nodes are equally informative for the estimation of global model parameters. As a consequence, they may lead to inaccurate estimates in the presence of nodes with low QoI. To overcome this challenge, in this article, we propose a novel consensus optimization framework for distributed machine-learning that incorporates the crucial metric, QoI. Theoretically, we prove that the convergence rate of the proposed framework is linear to the number of iterations, but has a tighter upper bound compared with ADMM. Experimentally, we show that the proposed framework is more efficient and effective than existing ADMM-based solutions on both synthetic and real-world datasets due to its faster convergence rate and higher accuracy.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"140 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Knowledge Discovery from Data (TKDD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3522591","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
In the era of big data, data are usually distributed across numerous connected computing and storage units (i.e., nodes or workers). In such an environment, many machine learning problems can be reformulated as a consensus optimization problem whose objective and constraint terms are split into N parts, each corresponding to a node. Such a problem can be solved efficiently in a distributed manner via the Alternating Direction Method of Multipliers (ADMM). However, existing consensus optimization frameworks assume that every node has the same quality of information (QoI), i.e., that the data from all nodes are equally informative for the estimation of the global model parameters. As a consequence, they may produce inaccurate estimates in the presence of nodes with low QoI. To overcome this challenge, in this article we propose a novel consensus optimization framework for distributed machine learning that incorporates this crucial metric, QoI. Theoretically, we prove that the convergence rate of the proposed framework is linear in the number of iterations but has a tighter upper bound than that of ADMM. Experimentally, we show that the proposed framework is more efficient and effective than existing ADMM-based solutions on both synthetic and real-world datasets, owing to its faster convergence rate and higher accuracy.
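To make the consensus formulation concrete, below is a minimal sketch of consensus ADMM for a distributed least-squares problem, with the QoI idea illustrated by weighting each node's contribution to the consensus update. The function name `qoi_consensus_admm`, the least-squares objective, and the use of QoI as simple per-node averaging weights are all illustrative assumptions; the abstract does not specify how the paper's framework incorporates QoI, and its actual formulation may differ.

```python
import numpy as np

# Sketch of consensus ADMM for the split problem
#   minimize  sum_i 0.5 * ||A_i x_i - b_i||^2   subject to  x_i = z,
# where each (A_i, b_i) lives on node i and z is the global consensus
# variable. The `qoi` weights below are a hypothetical stand-in for the
# paper's QoI metric: low-QoI nodes contribute less to the consensus.
def qoi_consensus_admm(A_parts, b_parts, qoi, rho=1.0, n_iter=100):
    N = len(A_parts)
    d = A_parts[0].shape[1]
    x = [np.zeros(d) for _ in range(N)]   # local primal variables
    u = [np.zeros(d) for _ in range(N)]   # scaled dual variables
    z = np.zeros(d)                       # global consensus variable
    w = np.asarray(qoi, dtype=float)
    w = w / w.sum()                       # normalized QoI weights
    # Pre-factor each node's local system (A_i^T A_i + rho * I).
    solvers = [np.linalg.inv(A.T @ A + rho * np.eye(d)) for A in A_parts]
    for _ in range(n_iter):
        # Local x-updates (one per node; parallel in a real deployment).
        for i in range(N):
            x[i] = solvers[i] @ (A_parts[i].T @ b_parts[i] + rho * (z - u[i]))
        # QoI-weighted consensus update; plain ADMM uses an unweighted mean.
        z = sum(w[i] * (x[i] + u[i]) for i in range(N))
        # Dual (scaled multiplier) updates.
        for i in range(N):
            u[i] = u[i] + x[i] - z
    return z

# Toy usage: three nodes, the third with much noisier (low-QoI) data.
rng = np.random.default_rng(0)
x_true = rng.normal(size=5)
A_parts, b_parts = [], []
for noise in (0.1, 0.1, 2.0):
    A = rng.normal(size=(40, 5))
    A_parts.append(A)
    b_parts.append(A @ x_true + noise * rng.normal(size=40))
z = qoi_consensus_admm(A_parts, b_parts, qoi=[1.0, 1.0, 0.1])
print(np.linalg.norm(z - x_true))
```

Down-weighting the noisy third node in the consensus step recovers x_true more accurately than the unweighted average an equal-QoI assumption would produce, which is the failure mode the abstract attributes to standard ADMM-based frameworks.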