Abdulmomen Ghalkha;Chaouki Ben Issaid;Anis Elgabli;Mehdi Bennis
Title: DIN: A Decentralized Inexact Newton Algorithm for Consensus Optimization
Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 2, pp. 663-674
Publication date: 2024-03-16
DOI: 10.1109/TMLCN.2024.3400756
URL: https://ieeexplore.ieee.org/document/10531222/
Citation count: 0
Abstract
This paper tackles a challenging decentralized consensus optimization problem defined over a network of interconnected devices. The devices collaborate to solve the problem using only their local data and exchanging information with their immediate neighbors. Newton-type methods are one approach to such problems and are known for their fast convergence. However, they have a significant drawback: they require transmitting Hessian information between devices, which makes them communication-inefficient and raises privacy concerns. To address these issues, we present a novel approach that transforms the Newton direction learning problem into a sum of separable functions subject to a consensus constraint. Using the proximal primal-dual (Prox-PDA) algorithm, it learns an inexact Newton direction alongside the global model without requiring devices to share their computed Hessians. Our algorithm, coined DIN, avoids sharing Hessian information between devices because each device shares a model-sized vector. This conceals the first- and second-order information, reduces the network’s burden, and improves both communication and energy efficiency. Furthermore, we prove that the DIN descent direction converges linearly to the optimal Newton direction. Numerical simulations corroborate that DIN is more communication-efficient in terms of communication rounds while consuming less communication and computation energy than existing second-order decentralized baselines.
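The reformulation mentioned in the abstract rests on a standard fact: the Newton direction for a sum of local objectives minimizes a sum of local quadratics, one per device, when all devices agree on the same direction. The sketch below checks this numerically. It is not the DIN algorithm and does not implement Prox-PDA. The three devices, their 2x2 Hessians and gradients, and all function names are hypothetical values chosen for illustration.

```python
# Illustrative only: three hypothetical devices, each holding a local
# 2x2 Hessian H_i and a local gradient g_i (made-up numbers, not from the paper).
local_hessians = [
    [[2.0, 0.5], [0.5, 1.0]],
    [[1.0, 0.0], [0.0, 3.0]],
    [[4.0, 1.0], [1.0, 2.0]],
]
local_gradients = [[1.0, -2.0], [0.5, 0.5], [-3.0, 1.0]]


def solve_2x2(a, b):
    """Solve a @ x = b for a 2x2 matrix a using Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


def local_quadratic_grad(h, g, d):
    """Gradient of q_i(d) = 0.5 * d^T H_i d + g_i^T d, which is H_i d + g_i."""
    return [h[r][0] * d[0] + h[r][1] * d[1] + g[r] for r in range(2)]


# Centralized Newton direction: d* = -(sum_i H_i)^{-1} (sum_i g_i).
h_sum = [[sum(h[r][c] for h in local_hessians) for c in range(2)]
         for r in range(2)]
g_sum = [sum(g[r] for g in local_gradients) for r in range(2)]
newton_direction = solve_2x2(h_sum, [-g_sum[0], -g_sum[1]])

# If every device uses the same d (consensus), the gradients of the local
# quadratics sum to zero at d*, so d* minimizes sum_i q_i(d).
grad_sum = [
    sum(local_quadratic_grad(h, g, newton_direction)[r]
        for h, g in zip(local_hessians, local_gradients))
    for r in range(2)
]
print("Newton direction:", newton_direction)
print("Sum of local quadratic gradients at d*:", grad_sum)
```

Each term of the sum depends only on one device's own Hessian and gradient, which is what makes a consensus-based solver applicable. How DIN solves this problem inexactly while exchanging only model-sized vectors is described in the paper itself, not in this sketch.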