Toward Quality of Information Aware Distributed Machine Learning

ACM Transactions on Knowledge Discovery from Data (TKDD) Pub Date: 2022-03-15 DOI: 10.1145/3522591

Houping Xiao, Shiyu Wang



Abstract

In the era of big data, data are usually distributed across numerous connected computing and storage units (i.e., nodes or workers). In such an environment, many machine learning problems can be reformulated as a consensus optimization problem, whose objective and constraint terms split into N parts, each corresponding to one node. Such a problem can be solved efficiently in a distributed manner via the Alternating Direction Method of Multipliers (ADMM). However, existing consensus optimization frameworks assume that every node has the same quality of information (QoI), i.e., that the data from all nodes are equally informative for estimating the global model parameters. As a consequence, they may produce inaccurate estimates when some nodes have low QoI. To overcome this challenge, we propose a novel consensus optimization framework for distributed machine learning that incorporates QoI as a crucial metric. Theoretically, we prove that the convergence rate of the proposed framework is linear in the number of iterations, with a tighter upper bound than that of ADMM. Experimentally, we show that the proposed framework is more efficient and effective than existing ADMM-based solutions on both synthetic and real-world datasets, owing to its faster convergence rate and higher accuracy.
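To make the consensus formulation in the abstract concrete, the following is a minimal sketch of standard global-consensus ADMM, i.e., the baseline in which every node is treated as equally informative. It is not the authors' QoI-aware framework, whose details the abstract does not give. The least-squares loss, the function name `consensus_admm_least_squares`, and the parameters `rho` and `n_iter` are assumptions chosen for illustration. Each node i keeps a local copy x_i of the parameters, and the constraint x_i = z ties all copies to one global variable z.

```python
import numpy as np


def consensus_admm_least_squares(A_parts, b_parts, rho=1.0, n_iter=200):
    """Standard global-consensus ADMM for least squares split across nodes.

    Solves: minimize sum_i 0.5 * ||A_i x_i - b_i||^2  subject to x_i = z.
    Illustrative baseline only (uniform treatment of nodes); the loss and
    all names here are assumptions, not taken from the paper.
    """
    n_nodes = len(A_parts)
    dim = A_parts[0].shape[1]
    x = np.zeros((n_nodes, dim))  # local parameter copies, one row per node
    u = np.zeros((n_nodes, dim))  # scaled dual variables, one row per node
    z = np.zeros(dim)             # global consensus variable

    # Precompute each node's system matrix and data term once.
    lhs = [A.T @ A + rho * np.eye(dim) for A in A_parts]
    atb = [A.T @ b for A, b in zip(A_parts, b_parts)]

    for _ in range(n_iter):
        # Local step: each node solves its own regularized least squares.
        # These solves are independent and could run in parallel.
        for i in range(n_nodes):
            x[i] = np.linalg.solve(lhs[i], atb[i] + rho * (z - u[i]))
        # Global step: plain (unweighted) average over all nodes.
        z = (x + u).mean(axis=0)
        # Dual step: accumulate each node's disagreement with the consensus.
        u += x - z
    return z, x
```

Given per-node arrays `A_parts` and `b_parts`, the returned `z` should approach the least-squares solution computed on the pooled data. The unweighted average in the global step is where the equal-QoI assumption enters in this sketch: a node with noisy data influences `z` as much as any other node.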

