Optimal Data Splitting in Distributed Optimization for Machine Learning

IF 0.5 4区 数学 Q3 MATHEMATICS Doklady Mathematics Pub Date : 2024-03-25 DOI:10.1134/S1064562423701600
D. Medyakov, G. Molodtsov, A. Beznosikov, A. Gasnikov
{"title":"Optimal Data Splitting in Distributed Optimization for Machine Learning","authors":"D. Medyakov,&nbsp;G. Molodtsov,&nbsp;A. Beznosikov,&nbsp;A. Gasnikov","doi":"10.1134/S1064562423701600","DOIUrl":null,"url":null,"abstract":"<p>The distributed optimization problem has become increasingly relevant recently. It has a lot of advantages such as processing a large amount of data in less time compared to non-distributed methods. However, most distributed approaches suffer from a significant bottleneck—the cost of communications. Therefore, a large amount of research has recently been directed at solving this problem. One such approach uses local data similarity. In particular, there exists an algorithm provably optimally exploiting the similarity property. But this result, as well as results from other works solve the communication bottleneck by focusing only on the fact that communication is significantly more expensive than local computing and does not take into account the various capacities of network devices and the different relationship between communication time and local computing expenses. We consider this setup and the objective of this study is to achieve an optimal ratio of distributed data between the server and local machines for any costs of communications and local computations. The running times of the network are compared between uniform and optimal distributions. The superior theoretical performance of our solutions is experimentally validated.</p>","PeriodicalId":531,"journal":{"name":"Doklady Mathematics","volume":"108 2 supplement","pages":"S465 - S475"},"PeriodicalIF":0.5000,"publicationDate":"2024-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Doklady Mathematics","FirstCategoryId":"100","ListUrlMain":"https://link.springer.com/article/10.1134/S1064562423701600","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MATHEMATICS","Score":null,"Total":0}
引用次数: 0

Abstract

The distributed optimization problem has become increasingly relevant recently. It has a lot of advantages such as processing a large amount of data in less time compared to non-distributed methods. However, most distributed approaches suffer from a significant bottleneck—the cost of communications. Therefore, a large amount of research has recently been directed at solving this problem. One such approach uses local data similarity. In particular, there exists an algorithm provably optimally exploiting the similarity property. But this result, as well as results from other works solve the communication bottleneck by focusing only on the fact that communication is significantly more expensive than local computing and does not take into account the various capacities of network devices and the different relationship between communication time and local computing expenses. We consider this setup and the objective of this study is to achieve an optimal ratio of distributed data between the server and local machines for any costs of communications and local computations. The running times of the network are compared between uniform and optimal distributions. The superior theoretical performance of our solutions is experimentally validated.

Abstract Image

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
机器学习分布式优化中的最佳数据分割
摘要 分布式优化问题近来变得越来越重要。与非分布式方法相比,分布式方法有很多优点,比如能在更短的时间内处理大量数据。然而,大多数分布式方法都存在一个明显的瓶颈--通信成本。因此,最近有大量研究致力于解决这一问题。其中一种方法就是使用本地数据相似性。特别是,有一种算法可以证明是最佳地利用了相似性特性。但这一结果以及其他研究成果在解决通信瓶颈问题时,只关注了通信费用明显高于本地计算费用这一事实,而没有考虑到网络设备的不同容量以及通信时间与本地计算费用之间的不同关系。我们考虑了这种设置,本研究的目标是在通信和本地计算成本不变的情况下,使服务器和本地机器之间的分布式数据达到最佳比例。我们比较了均匀分布和最优分布的网络运行时间。实验验证了我们解决方案的卓越理论性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Doklady Mathematics
Doklady Mathematics 数学-数学
CiteScore
1.00
自引率
16.70%
发文量
39
审稿时长
3-6 weeks
期刊介绍: Doklady Mathematics is a journal of the Presidium of the Russian Academy of Sciences. It contains English translations of papers published in Doklady Akademii Nauk (Proceedings of the Russian Academy of Sciences), which was founded in 1933 and is published 36 times a year. Doklady Mathematics includes the materials from the following areas: mathematics, mathematical physics, computer science, control theory, and computers. It publishes brief scientific reports on previously unpublished significant new research in mathematics and its applications. The main contributors to the journal are Members of the RAS, Corresponding Members of the RAS, and scientists from the former Soviet Union and other foreign countries. Among the contributors are the outstanding Russian mathematicians.
期刊最新文献
Semi-Analytical Solution of Brent Equations Addition to the Article “A New Spectral Measure of Complexity and Its Capabilities for Detecting Signals in Noise” On a Dini Type Blow-Up Condition for Solutions of Higher Order Nonlinear Differential Inequalities Getting over Wide Obstacles by a Multi-Legged Robot On the Accuracy of Calculating Invariants in Centered Rarefaction Waves and in Their Influence Area
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1