Distributed Saddle Point Problems for Strongly Concave-Convex Functions

IEEE Transactions on Signal and Information Processing over Networks; IF 3.0; Region 3 (Computer Science); JCR Q2 (Engineering, Electrical & Electronic); Pub Date: 2023-09-28; DOI: 10.1109/TSIPN.2023.3317807

Muhammad I. Qureshi;Usman A. Khan

IEEE Transactions on Signal and Information Processing over Networks, vol. 9, pp. 679-690. Journal article, published 2023-09-28. https://ieeexplore.ieee.org/document/10266914/

Citations: 0



Distributed Saddle Point Problems for Strongly Concave-Convex Functions

In this article, we propose GT-GDA, a distributed optimization method to solve saddle point problems of the form: $\min_{\mathbf{x}} \max_{\mathbf{y}} \{ F(\mathbf{x},\mathbf{y}) := G(\mathbf{x}) + \langle \mathbf{y}, \overline{P}\mathbf{x} \rangle - H(\mathbf{y}) \}$, where the functions $G(\cdot)$, $H(\cdot)$, and the coupling matrix $\overline{P}$ are distributed over a strongly connected network of nodes. GT-GDA is a first-order method that uses gradient tracking to eliminate the dissimilarity caused by heterogeneous data distribution among the nodes. In the most general form, GT-GDA includes a consensus over the local coupling matrices to achieve the optimal (unique) saddle point, however, at the expense of increased communication. To avoid this, we propose a more efficient variant GT-GDA-Lite that does not incur additional communication and analyze its convergence in various scenarios. We show that GT-GDA converges linearly to the unique saddle point solution when $G$ is smooth and convex, $H$ is smooth and strongly convex, and the global coupling matrix $\overline{P}$ has full column rank. We further characterize the regime under which GT-GDA exhibits a network topology-independent convergence behavior. We next show the linear convergence of GT-GDA-Lite to an error around the unique saddle point, which goes to zero when the coupling cost $\langle \mathbf{y}, \overline{P}\mathbf{x} \rangle$ is common to all nodes, or when $G$ and $H$ are quadratic. Numerical experiments illustrate the convergence properties and importance of GT-GDA and GT-GDA-Lite for several applications.
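The abstract does not state the update equations, so the following is only a minimal sketch of a generic gradient-tracking descent-ascent iteration on a toy scalar problem, not the authors' GT-GDA or GT-GDA-Lite. Several things are assumptions made for illustration: each node $i$ holds a quadratic local cost $f_i(x,y) = \tfrac{1}{2}a_i x^2 + b_i x + p_i x y - (\tfrac{1}{2}c_i y^2 + d_i y)$, the global problem is the average of the $f_i$, the network is an undirected four-node ring with doubly stochastic weights `W`, $G$ is taken strongly convex for simplicity, and the step size and all data values are arbitrary. The helper names `mix` and `gt_descent_ascent` are hypothetical.

```python
# Toy illustration of gradient tracking for a distributed saddle point problem.
# Every number here (costs, weights, step size) is an illustrative assumption.

def mix(W, v):
    """One consensus round: node i averages its neighbours' values with weights W[i]."""
    n = len(v)
    return [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]


def gt_descent_ascent(a, b, p, c, d, W, step=0.02, iters=2000):
    """Generic gradient-tracking descent-ascent (hypothetical helper).

    Node i holds f_i(x, y) = 0.5*a_i*x^2 + b_i*x + p_i*x*y - (0.5*c_i*y^2 + d_i*y)
    and the network seeks the saddle point of the average of the f_i.
    """
    n = len(a)

    def grad_x(i, x, y):
        return a[i] * x + b[i] + p[i] * y

    def grad_y(i, x, y):
        return p[i] * x - c[i] * y - d[i]

    x = [0.0] * n
    y = [0.0] * n
    # Trackers start at the local gradients, so their network average stays
    # equal to the average of the local gradients at every iteration.
    qx = [grad_x(i, x[i], y[i]) for i in range(n)]
    qy = [grad_y(i, x[i], y[i]) for i in range(n)]

    for _ in range(iters):
        wx, wy = mix(W, x), mix(W, y)
        x_new = [wx[i] - step * qx[i] for i in range(n)]  # descent in x
        y_new = [wy[i] + step * qy[i] for i in range(n)]  # ascent in y
        wqx, wqy = mix(W, qx), mix(W, qy)
        # Tracker update: mix, then add the change in the local gradient.
        qx = [wqx[i] + grad_x(i, x_new[i], y_new[i]) - grad_x(i, x[i], y[i]) for i in range(n)]
        qy = [wqy[i] + grad_y(i, x_new[i], y_new[i]) - grad_y(i, x[i], y[i]) for i in range(n)]
        x, y = x_new, y_new
    return x, y


# Four nodes on an undirected ring with doubly stochastic weights (assumed).
W = [[0.50, 0.25, 0.00, 0.25],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.25, 0.00, 0.25, 0.50]]
a = [1.0, 2.0, 3.0, 2.0]
b = [1.0, -1.0, 2.0, 0.0]
p = [1.0, 2.0, 0.0, 1.0]   # heterogeneous local coupling terms
c = [2.0, 1.0, 3.0, 2.0]
d = [0.0, 1.0, -1.0, 2.0]

x, y = gt_descent_ascent(a, b, p, c, d, W)

# Closed-form saddle point of the averaged quadratic, for comparison.
n = len(a)
a_bar, b_bar, p_bar, c_bar, d_bar = (sum(v) / n for v in (a, b, p, c, d))
x_star = (p_bar * d_bar - b_bar * c_bar) / (a_bar * c_bar + p_bar ** 2)
y_star = (p_bar * x_star - d_bar) / c_bar
print("nodes x:", x, "x*:", x_star)
print("nodes y:", y, "y*:", y_star)
```

For this data the averaged problem has its saddle point at $x^\star = -0.1$, $y^\star = -0.3$, obtained from the first-order conditions $\bar{a}x + \bar{b} + \bar{p}y = 0$ and $\bar{p}x - \bar{c}y - \bar{d} = 0$. All nodes should agree on that point even though their local costs differ, which is the role gradient tracking plays in the abstract's description.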


Source journal

IEEE Transactions on Signal and Information Processing over Networks (Computer Science: Computer Networks and Communications)

CiteScore: 5.80

Self-citation rate: 12.50%

About the journal: The IEEE Transactions on Signal and Information Processing over Networks publishes high-quality papers that extend the classical notions of processing of signals defined over vector spaces (e.g., time and space) to processing of signals and information (data) defined over networks, potentially dynamically varying. In signal processing over networks, the topology of the network may define structural relationships in the data, or may constrain processing of the data. Topics include distributed algorithms for filtering, detection, estimation, adaptation and learning, model selection, data fusion, and diffusion or evolution of information over such networks, and applications of distributed signal processing.