Adaptive pruning-based Newton's method for distributed learning
Shuzhen Chen, Yuan Yuan, Youming Tao, Tianzhu Wang, Zhipeng Cai, Dongxiao Yu
Theoretical Computer Science, Volume 1026, Article 114987 (published 2024-11-26). DOI: 10.1016/j.tcs.2024.114987
Citations: 0
Abstract
Newton's method leverages curvature information to boost performance and thus outperforms first-order methods on distributed learning problems. However, Newton's method is impractical in large-scale and heterogeneous learning environments due to obstacles such as the high computation and communication costs of the Hessian matrix, sub-model diversity, staleness of training, and data heterogeneity. To overcome these obstacles, this paper presents a novel and efficient algorithm named Distributed Adaptive Newton Learning (DANL), which addresses the drawbacks of Newton's method through a simple Hessian initialization and adaptive allocation of training regions. The algorithm exhibits remarkable convergence properties, which are rigorously examined under standard assumptions in stochastic optimization. The theoretical analysis proves that DANL attains a linear convergence rate while adapting efficiently to available resources. Furthermore, DANL is notably independent of the problem's condition number and removes the need for complex parameter tuning. Experiments demonstrate that DANL achieves linear convergence with efficient communication and strong performance across different datasets.
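For readers unfamiliar with the second-order update the abstract refers to, the sketch below shows a plain, single-machine Newton step on an L2-regularized logistic-regression loss, illustrating how curvature (Hessian) information is used where first-order methods use only the gradient. The function names, the synthetic data, and the regularization-based Hessian conditioning are illustrative assumptions; this is not the DANL algorithm from the paper.

```python
import numpy as np

def logistic_loss_grad_hess(w, X, y, lam=1e-3):
    """Gradient and Hessian of an L2-regularized logistic loss (illustrative sketch)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))              # predicted probabilities
    grad = X.T @ (p - y) / len(y) + lam * w
    D = p * (1.0 - p)                                # per-sample curvature weights
    hess = (X.T * D) @ X / len(y) + lam * np.eye(X.shape[1])
    return grad, hess

def newton_step(w, X, y, lam=1e-3):
    """One Newton update w <- w - H^{-1} g, exploiting the curvature that
    first-order methods ignore. Generic sketch, not DANL itself."""
    grad, hess = logistic_loss_grad_hess(w, X, y, lam)
    return w - np.linalg.solve(hess, grad)

# Minimal usage on synthetic data (hypothetical example).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) + 0.1 * rng.normal(size=200) > 0).astype(float)
w = np.zeros(5)
for _ in range(10):
    w = newton_step(w, X, y)
```

In a distributed setting the expensive part of this update is forming and communicating the Hessian, which is the cost the paper's simple Hessian initialization and adaptive allocation of training regions are designed to avoid.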
Journal Introduction:
Theoretical Computer Science is mathematical and abstract in spirit, but it derives its motivation from practical and everyday computation. Its aim is to understand the nature of computation and, as a consequence of this understanding, provide more efficient methodologies. All papers introducing or studying mathematical, logic and formal concepts and methods are welcome, provided that their motivation is clearly drawn from the field of computing.