Toward Quality of Information Aware Distributed Machine Learning

Houping Xiao, Shiyu Wang
ACM Transactions on Knowledge Discovery from Data (TKDD). Published 2022-03-15. DOI: 10.1145/3522591 (https://doi.org/10.1145/3522591)

Abstract

In the era of big data, data are usually distributed across numerous connected computing and storage units (i.e., nodes or workers). In such an environment, many machine learning problems can be reformulated as a consensus optimization problem, whose objective and constraint terms split into N parts, each corresponding to one node. Such a problem can be solved efficiently in a distributed manner via the Alternating Direction Method of Multipliers (ADMM). However, existing consensus optimization frameworks assume that every node has the same quality of information (QoI), i.e., that the data from all nodes are equally informative for estimating the global model parameters. As a consequence, they may produce inaccurate estimates when some nodes have low QoI. To overcome this challenge, we propose a novel consensus optimization framework for distributed machine learning that incorporates QoI as a crucial metric. Theoretically, we prove that the convergence rate of the proposed framework is linear in the number of iterations, with a tighter upper bound than that of ADMM. Experimentally, we show that the proposed framework is more efficient and effective than existing ADMM-based solutions on both synthetic and real-world datasets, owing to its faster convergence and higher accuracy.
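
As a rough illustration of the baseline the abstract refers to, the sketch below runs standard global-consensus ADMM (scaled form) on a least-squares problem split across N nodes. It is not the QoI-aware framework proposed in the article, whose weighting scheme the abstract does not specify. The function name, the least-squares objective, and the parameters rho and iters are assumptions chosen for illustration only.

```python
import numpy as np

def consensus_admm_least_squares(A_parts, b_parts, rho=1.0, iters=200):
    """Standard consensus ADMM for min_x sum_i 0.5 * ||A_i x - b_i||^2.

    Every node i is treated as equally informative (no QoI weighting).
    All names and defaults here are illustrative assumptions.
    """
    N = len(A_parts)
    n = A_parts[0].shape[1]
    x = np.zeros((N, n))  # local parameter estimates, one row per node
    u = np.zeros((N, n))  # scaled dual variables, one row per node
    z = np.zeros(n)       # global consensus variable

    # Each node precomputes the pieces of its local normal equations.
    lhs = [A.T @ A + rho * np.eye(n) for A in A_parts]
    rhs = [A.T @ b for A, b in zip(A_parts, b_parts)]

    for _ in range(iters):
        # Local step (parallel across nodes in a real deployment):
        # x_i = argmin 0.5*||A_i x - b_i||^2 + (rho/2)*||x - z + u_i||^2
        for i in range(N):
            x[i] = np.linalg.solve(lhs[i], rhs[i] + rho * (z - u[i]))
        # Consensus step: unweighted average over all nodes.
        z = (x + u).mean(axis=0)
        # Dual step: accumulate each node's disagreement with z.
        u += x - z
    return z
```

The consensus step is where every node contributes equally through a plain average, which is the equal-QoI assumption the abstract criticizes. A QoI-aware variant would presumably treat nodes unequally, but the details would have to come from the article itself.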