Log-Scale Quantization in Distributed First-Order Methods: Gradient-Based Learning From Distributed Data

IEEE Transactions on Automation Science and Engineering · Impact Factor: 6.4 · JCR Q1 (Automation & Control Systems) · CAS Tier 2 (Computer Science) · Published: 2025-01-08 · DOI: 10.1109/TASE.2025.3526967
Mohammadreza Doostmohammadian;Muhammad I. Qureshi;Mohammad Hossein Khalesi;Hamid R. Rabiee;Usman A. Khan
Volume 22, pages 10948-10959. Full text: https://ieeexplore.ieee.org/document/10833724/
Citations: 0

Abstract

Decentralized strategies are of interest for learning from large-scale data over networks. This paper studies learning over a network of geographically distributed nodes/agents subject to quantization. Each node possesses a private local cost function, and these collectively form a global cost function that the considered methodology aims to minimize. In contrast to many existing papers, the information exchange among nodes is log-quantized to address limited network bandwidth in practical situations. We consider a computationally efficient first-order distributed optimization algorithm (with no extra inner consensus loop) that leverages node-level gradient correction based on local data and network-level gradient aggregation only over nearby nodes. This method requires only balanced networks, with no need for stochastic weight design, and it can handle log-scale quantized data exchange over possibly time-varying and switching network setups. We study convergence over both structured networks (for example, training over data centers) and ad-hoc multi-agent networks (for example, training over dynamic robotic networks). Through experimental validation, we show that (i) structured networks generally result in a smaller optimality gap, and (ii) log-scale quantization leads to a smaller optimality gap compared to uniform quantization.

Note to Practitioners—Motivated by recent developments in cloud computing, parallel processing, and the availability of low-cost CPUs and communication networks, this paper considers distributed and decentralized algorithms for machine learning and optimization. These algorithms are particularly relevant for decentralized data mining, where data sets are distributed across a network of computing nodes. A practical example is the classification of images over a networked data center. In real-world scenarios, practical model nonlinearities such as data quantization must be addressed for information exchange among the computing nodes.
This work emphasizes the importance of handling log-scale quantization and compares its performance against uniform quantization. By exploring these quantization methods, we aim to determine which is more accurate in terms of optimality gap and learning residual. Moreover, we study the impact of the structure of the information-sharing network on reducing the optimality gap and improving the convergence rate of distributed algorithms. As contemporary distributed and networked data mining systems demand highly accurate algorithms with fast convergence for real-time applications, our research emphasizes the benefit of structured networks under log-quantized information exchange. Our findings can be extended to different machine learning algorithms, offering pathways to more accurate and faster data mining solutions.
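The log-scale quantization discussed in the abstract can be illustrated with a minimal sketch (a generic logarithmic quantizer, not the paper's exact scheme; the level parameter `rho` is a hypothetical choice). Rounding a value to the nearest point of the grid {±rho^k : k integer} bounds the *relative* quantization error uniformly across magnitudes, which is what makes log-scale quantization attractive for gradient entries spanning many orders of magnitude, in contrast to a uniform quantizer whose fixed absolute step swamps small values.

```python
import numpy as np

def log_quantize(x, rho=0.95):
    """Map each entry of x to the nearest grid point ±rho^k (k integer).

    Generic log-scale quantizer sketch; `rho` in (0, 1) sets the spacing
    between adjacent levels (ratio 1/rho), and is NOT taken from the paper.
    Rounding is done in log-space, so the relative error is bounded by
    roughly rho^(-1/2) - 1 regardless of |x|. Zero maps to zero.
    """
    x = np.asarray(x, dtype=float)
    sign = np.sign(x)
    mag = np.abs(x)
    out = np.zeros_like(mag)
    nz = mag > 0
    # Nearest exponent k such that rho^k approximates |x| in log-space.
    k = np.round(np.log(mag[nz]) / np.log(rho))
    out[nz] = rho ** k
    return sign * out

# Relative error stays small across four orders of magnitude:
x = np.array([1e-4, 0.3, 7.0, -250.0])
q = log_quantize(x)
rel_err = np.abs(q - x) / np.abs(x)
```

For rho = 0.95, adjacent levels differ by a factor of about 1.053, so the worst-case relative error is under 3% for tiny and huge entries alike; a uniform quantizer with any fixed step would either destroy the 1e-4 entry or waste bits on the 250 one.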
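The first-order method described above combines node-level gradient correction with gradient aggregation over nearby nodes. A generic gradient-tracking iteration conveys this structure; the sketch below is a simplified stand-in on a toy quadratic problem (it is not the paper's exact update, and the quantization of exchanged messages is omitted for clarity). The doubly stochastic ring weights `W` are a hypothetical choice satisfying the balanced-network requirement.

```python
import numpy as np

# Four nodes on a ring, each with a private quadratic cost
# f_i(x) = 0.5 * (x - b_i)^2; the global minimizer is mean(b).
b = np.array([1.0, 3.0, -2.0, 6.0])
n = len(b)

# Doubly stochastic (hence balanced) ring weight matrix.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

grad = lambda x: x - b   # stacked local gradients, one entry per node

x = np.zeros(n)          # local model copies
y = grad(x)              # gradient trackers, initialized to local gradients
alpha = 0.1              # step size (hypothetical, chosen small for stability)

for _ in range(500):
    x_new = W @ x - alpha * y           # neighbor aggregation + descent step
    y = W @ y + grad(x_new) - grad(x)   # correction: track the average gradient
    x = x_new

# Every node's copy converges to the global minimizer mean(b) = 2.0.
```

The `y` update is the "gradient correction" ingredient: because `W` is doubly stochastic, the sum of the trackers always equals the sum of the current local gradients, so each node steps along an estimate of the *global* gradient while only ever talking to its ring neighbors.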
Source journal: IEEE Transactions on Automation Science and Engineering (Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Annual article count: 404
Review time: 3.0 months
Journal overview: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.
Latest articles in this journal:
- Dynamic Programming based Fractional-order Compound Steering Control for Lateral Stabilization of DDEVs with Closed-loop Game
- Consensus Control for PDE-ODE MASs with Multi Delays: A Dual-Mode Adaptive Event-Triggered Strategy and Novel Stability Analysis Criterion
- Geometry-Aware Physics Informed PointNet (GeoPIPN) for Fast Thermal Distribution Prediction in Additive Manufacturing of Unseen Part Geometries
- Transfer Learning-Based Deep Reinforcement Learning for Adaptive Control of Maglev Trains
- Data-Driven Precision Velocity Control for Maglev Car Systems via Error-Scheduled Model-Free Adaptive Control