{"title":"DISH:利用系统异质性的分布式混合优化方法","authors":"Xiaochun Niu;Ermin Wei","doi":"10.1109/TSP.2024.3450351","DOIUrl":null,"url":null,"abstract":"We study distributed optimization problems over multi-agent networks, including consensus and network flow problems. Existing distributed methods neglect the heterogeneity among agents’ computational capabilities, limiting their effectiveness. To address this, we propose DISH, a \n<underline>dis</u>\ntributed \n<underline>h</u>\nybrid method that leverages system heterogeneity. DISH allows agents with higher computational capabilities or lower computational costs to perform local Newton-type updates while others adopt simpler gradient-type updates. Notably, DISH covers existing methods like EXTRA, DIGing, and ESOM-0 as special cases. To analyze DISH's performance with general update directions, we formulate distributed problems as minimax problems and introduce GRAND (\n<underline>g</u>\nradient-\n<underline>r</u>\nelated \n<underline>a</u>\nscent a\n<underline>n</u>\nd \n<underline>d</u>\nescent) and its alternating version, Alt-GRAND, for solving these problems. GRAND generalizes DISH to centralized minimax settings, accommodating various descent ascent update directions, including gradient-type, Newton-type, scaled gradient, and other general directions, within acute angles to the partial gradients. Theoretical analysis establishes global sublinear and linear convergence rates for GRAND and Alt-GRAND in strongly-convex-nonconcave and strongly-convex-PL settings, providing linear rates for DISH. In addition, we derive the local superlinear convergence of Newton-based variations of GRAND in centralized settings to show the potentials and limitations of Newton's method in distributed settings. Numerical experiments validate the effectiveness of our methods.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"72 ","pages":"4007-4021"},"PeriodicalIF":4.6000,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DISH: A Distributed Hybrid Optimization Method Leveraging System Heterogeneity\",\"authors\":\"Xiaochun Niu;Ermin Wei\",\"doi\":\"10.1109/TSP.2024.3450351\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We study distributed optimization problems over multi-agent networks, including consensus and network flow problems. Existing distributed methods neglect the heterogeneity among agents’ computational capabilities, limiting their effectiveness. To address this, we propose DISH, a \\n<underline>dis</u>\\ntributed \\n<underline>h</u>\\nybrid method that leverages system heterogeneity. DISH allows agents with higher computational capabilities or lower computational costs to perform local Newton-type updates while others adopt simpler gradient-type updates. Notably, DISH covers existing methods like EXTRA, DIGing, and ESOM-0 as special cases. To analyze DISH's performance with general update directions, we formulate distributed problems as minimax problems and introduce GRAND (\\n<underline>g</u>\\nradient-\\n<underline>r</u>\\nelated \\n<underline>a</u>\\nscent a\\n<underline>n</u>\\nd \\n<underline>d</u>\\nescent) and its alternating version, Alt-GRAND, for solving these problems. 
GRAND generalizes DISH to centralized minimax settings, accommodating various descent ascent update directions, including gradient-type, Newton-type, scaled gradient, and other general directions, within acute angles to the partial gradients. Theoretical analysis establishes global sublinear and linear convergence rates for GRAND and Alt-GRAND in strongly-convex-nonconcave and strongly-convex-PL settings, providing linear rates for DISH. In addition, we derive the local superlinear convergence of Newton-based variations of GRAND in centralized settings to show the potentials and limitations of Newton's method in distributed settings. Numerical experiments validate the effectiveness of our methods.\",\"PeriodicalId\":13330,\"journal\":{\"name\":\"IEEE Transactions on Signal Processing\",\"volume\":\"72 \",\"pages\":\"4007-4021\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2024-08-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Signal Processing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10648947/\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10648947/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
DISH: A Distributed Hybrid Optimization Method Leveraging System Heterogeneity
We study distributed optimization problems over multi-agent networks, including consensus and network flow problems. Existing distributed methods neglect the heterogeneity among agents' computational capabilities, which limits their effectiveness. To address this, we propose DISH, a distributed hybrid method that leverages system heterogeneity. DISH allows agents with higher computational capabilities or lower computational costs to perform local Newton-type updates, while the others adopt simpler gradient-type updates. Notably, DISH covers existing methods such as EXTRA, DIGing, and ESOM-0 as special cases. To analyze DISH's performance with general update directions, we formulate distributed problems as minimax problems and introduce GRAND (gradient-related ascent and descent) and its alternating version, Alt-GRAND, for solving them. GRAND generalizes DISH to centralized minimax settings and accommodates various descent-ascent update directions, including gradient-type, Newton-type, scaled-gradient, and other general directions that lie within acute angles of the partial gradients. Theoretical analysis establishes global sublinear and linear convergence rates for GRAND and Alt-GRAND in strongly-convex-nonconcave and strongly-convex-PL settings, which yields linear rates for DISH. In addition, we derive the local superlinear convergence of Newton-based variants of GRAND in centralized settings to show the potential and limitations of Newton's method in distributed settings. Numerical experiments validate the effectiveness of our methods.
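To make the hybrid-update idea concrete, below is a minimal illustrative sketch, not the authors' DISH algorithm: each agent averages with its neighbors through a mixing matrix and then takes either a Newton-type or a gradient-type local step, with a per-agent flag standing in for its computational capability. The quadratic objectives, ring topology, and step size are all hypothetical choices.

```python
# Illustrative sketch of heterogeneous (hybrid) local updates in a
# DGD-style consensus scheme. NOT the DISH algorithm itself; every
# concrete choice below (objectives, topology, step size) is assumed.
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim = 4, 3

# Hypothetical local objectives f_i(x) = 0.5 x^T A_i x - b_i^T x (strongly convex).
A = [np.eye(dim) + M @ M.T for M in rng.standard_normal((n_agents, dim, dim))]
b = rng.standard_normal((n_agents, dim))

# Doubly stochastic mixing matrix for an assumed ring of 4 agents.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

use_newton = [True, False, True, False]  # heterogeneity: per-agent update type
x = np.zeros((n_agents, dim))
alpha = 0.1

for _ in range(200):
    mixed = W @ x                        # consensus (mixing) step
    for i in range(n_agents):
        grad = A[i] @ x[i] - b[i]        # local partial gradient
        if use_newton[i]:
            # Newton-type direction: Hessian of f_i is A_i.
            step = np.linalg.solve(A[i], grad)
        else:
            # Gradient-type direction for agents with less compute.
            step = grad
        x[i] = mixed[i] - alpha * step

# With a constant step size this reaches a neighborhood of the minimizer
# of sum_i f_i; the printed value measures the remaining consensus gap.
print(np.max(np.abs(x - x.mean(axis=0))))
```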
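In the same spirit, here is a minimal sketch of the descent-ascent template that GRAND generalizes, run on a hypothetical strongly-convex-strongly-concave saddle function: descend in x and ascend in y along directions that make acute angles with the partial gradients, here a positive-definite-scaled gradient and a plain gradient.

```python
# Illustrative descent-ascent iteration on a toy minimax problem
# L(x, y) = 0.5||x||^2 + x.y - 0.5||y||^2, with saddle point (0, 0).
# A sketch of the template only; the scaling matrix and step size are assumed.
import numpy as np

def grad_x(x, y):  # partial gradient of L with respect to x
    return x + y

def grad_y(x, y):  # partial gradient of L with respect to y
    return x - y

x = np.array([2.0, -1.0])
y = np.array([-1.5, 0.5])
Dx = np.diag([1.0, 0.5])   # positive-definite scaling keeps an acute angle
eta = 0.2

for _ in range(300):
    dx = Dx @ grad_x(x, y)  # scaled-gradient descent direction in x
    dy = grad_y(x, y)       # gradient-type ascent direction in y
    x, y = x - eta * dx, y + eta * dy

print(x, y)  # both iterates approach the saddle point (0, 0)
```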
Journal Introduction:
The IEEE Transactions on Signal Processing covers novel theory, algorithms, performance analyses and applications of techniques for the processing, understanding, learning, retrieval, mining, and extraction of information from signals. The term “signal” includes, among others, audio, video, speech, image, communication, geophysical, sonar, radar, medical and musical signals. Examples of topics of interest include, but are not limited to, information processing and the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals.