Distributed Saddle Point Problems for Strongly Concave-Convex Functions
Authors: Muhammad I. Qureshi; Usman A. Khan
Journal: IEEE Transactions on Signal and Information Processing over Networks, vol. 9, pp. 679-690
Published: 2023-09-28
DOI: 10.1109/TSIPN.2023.3317807
URL: https://ieeexplore.ieee.org/document/10266914/
Impact factor: 3.0 (JCR Q2, Engineering, Electrical & Electronic)
Citations: 0
Abstract
In this article, we propose GT-GDA, a distributed optimization method to solve saddle point problems of the form: $\min_{\mathbf{x}} \max_{\mathbf{y}} \lbrace F(\mathbf{x},\mathbf{y}) := G(\mathbf{x}) + \langle \mathbf{y}, \overline{P} \mathbf{x} \rangle - H(\mathbf{y}) \rbrace$, where the functions $G(\cdot)$, $H(\cdot)$, and the coupling matrix $\overline{P}$ are distributed over a strongly connected network of nodes. GT-GDA is a first-order method that uses gradient tracking to eliminate the dissimilarity caused by heterogeneous data distribution among the nodes. In the most general form, GT-GDA includes a consensus over the local coupling matrices to achieve the optimal (unique) saddle point, however, at the expense of increased communication. To avoid this, we propose a more efficient variant GT-GDA-Lite that does not incur additional communication and analyze its convergence in various scenarios. We show that GT-GDA converges linearly to the unique saddle point solution when $G$ is smooth and convex, $H$ is smooth and strongly convex, and the global coupling matrix $\overline{P}$ has full column rank. We further characterize the regime under which GT-GDA exhibits a network topology-independent convergence behavior. We next show the linear convergence of GT-GDA-Lite to an error around the unique saddle point, which goes to zero when the coupling cost $\langle \mathbf{y}, \overline{P} \mathbf{x} \rangle$ is common to all nodes, or when $G$ and $H$ are quadratic. Numerical experiments illustrate the convergence properties and importance of GT-GDA and GT-GDA-Lite for several applications.
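To make the problem class concrete, the following is a minimal sketch of a generic gradient-tracking descent-ascent iteration on a small quadratic instance. It is not the paper's exact GT-GDA or GT-GDA-Lite update, which the abstract does not spell out. Everything in it is an assumption chosen for illustration: the local costs $G_i(\mathbf{x}) = \tfrac{1}{2} g_i \|\mathbf{x}\|^2 + \langle \mathbf{a}_i, \mathbf{x} \rangle$ and $H_i(\mathbf{y}) = \tfrac{1}{2} h_i \|\mathbf{y}\|^2 + \langle \mathbf{b}_i, \mathbf{y} \rangle$, the local coupling matrices $P_i$ (with $\overline{P}$ taken as their average), the undirected ring with a doubly stochastic weight matrix `W`, the common step size `alpha`, and all function names. The $G_i$ are strongly convex here, which is stronger than the abstract's assumption and keeps the step-size choice uncritical. Because the instance is quadratic, the saddle point of the averaged problem solves the linear system $\nabla G(\mathbf{x}) + \overline{P}^\top \mathbf{y} = 0$, $\overline{P}\mathbf{x} - \nabla H(\mathbf{y}) = 0$, which gives a reference to compare against.

```python
import numpy as np


def ring_weights(n):
    """Doubly stochastic mixing matrix for an undirected ring of n >= 3 nodes."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W


def make_problem(rng, n=4, d=2, p=3):
    """Hypothetical heterogeneous local data (g_i, h_i, a_i, b_i, P_i)."""
    g = rng.uniform(1.0, 2.0, n)   # curvature of each local G_i
    h = rng.uniform(1.0, 2.0, n)   # curvature of each local H_i
    a = rng.normal(size=(n, d))    # linear term of each G_i
    b = rng.normal(size=(n, p))    # linear term of each H_i
    R = rng.normal(size=(n, p, d))
    # Scale every local coupling matrix to unit spectral norm.
    P = np.stack([Ri / np.linalg.norm(Ri, 2) for Ri in R])
    return g, h, a, b, P


def local_grads(X, Y, g, h, a, b, P):
    """Stacked local partial gradients of F_i at (x_i, y_i)."""
    grad_x = g[:, None] * X + a + np.einsum("ipd,ip->id", P, Y)  # grad G_i + P_i^T y_i
    grad_y = np.einsum("ipd,id->ip", P, X) - h[:, None] * Y - b  # P_i x_i - grad H_i
    return grad_x, grad_y


def gradient_tracking_gda(g, h, a, b, P, W, alpha=0.02, iters=5000):
    """Generic gradient-tracking descent-ascent (illustrative, not the paper's update)."""
    n, p, d = P.shape
    X, Y = np.zeros((n, d)), np.zeros((n, p))
    gx, gy = local_grads(X, Y, g, h, a, b, P)
    Qx, Qy = gx.copy(), gy.copy()          # trackers start at the local gradients
    for _ in range(iters):
        X_new = W @ X - alpha * Qx         # mix with neighbours, then descend in x
        Y_new = W @ Y + alpha * Qy         # mix with neighbours, then ascend in y
        gx_new, gy_new = local_grads(X_new, Y_new, g, h, a, b, P)
        Qx = W @ Qx + gx_new - gx          # track the network-average x-gradient
        Qy = W @ Qy + gy_new - gy          # track the network-average y-gradient
        X, Y, gx, gy = X_new, Y_new, gx_new, gy_new
    return X, Y


def saddle_point(g, h, a, b, P):
    """Closed-form saddle point of the averaged quadratic problem."""
    n, p, d = P.shape
    Pbar = P.mean(axis=0)
    K = np.block([[g.mean() * np.eye(d), Pbar.T],
                  [Pbar, -h.mean() * np.eye(p)]])
    rhs = np.concatenate([-a.mean(axis=0), b.mean(axis=0)])
    sol = np.linalg.solve(K, rhs)
    return sol[:d], sol[d:]


rng = np.random.default_rng(0)
g, h, a, b, P = make_problem(rng)
X, Y = gradient_tracking_gda(g, h, a, b, P, ring_weights(len(g)))
x_star, y_star = saddle_point(g, h, a, b, P)
print("max |x_i - x*|:", np.abs(X - x_star).max())
print("max |y_i - y*|:", np.abs(Y - y_star).max())
```

In this sketch each node tracks its full local partial gradients, including the coupling term built from its own $P_i$, so no separate consensus on the coupling matrices is run. The tracking variables `Qx` and `Qy` are what let every node's iterate approach the saddle point of the averaged problem despite the heterogeneous local data.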
About the journal:
The IEEE Transactions on Signal and Information Processing over Networks publishes high-quality papers that extend the classical notions of processing of signals defined over vector spaces (e.g. time and space) to processing of signals and information (data) defined over networks, potentially dynamically varying. In signal processing over networks, the topology of the network may define structural relationships in the data, or may constrain processing of the data. Topics include distributed algorithms for filtering, detection, estimation, adaptation and learning, model selection, data fusion, and diffusion or evolution of information over such networks, and applications of distributed signal processing.