Generalized Lagrange Coded Computing: A Flexible Computation-Communication Tradeoff for Resilient, Secure, and Private Computation

IEEE Transactions on Communications; IF 8.3; Zone 2, Computer Science; JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-05; DOI: 10.1109/TCOMM.2024.3492071

Jinbao Zhu, Hengxuan Tang, Songze Li, Yijia Chang

IEEE Transactions on Communications, vol. 73, no. 6, pp. 4213-4227. Full text: https://ieeexplore.ieee.org/document/10744414/

Citation count: 0

Abstract

We consider the problem of evaluating arbitrary multivariate polynomials over a massive dataset containing multiple inputs, on a distributed computing system with a leader node and multiple worker nodes. Generalized Lagrange Coded Computing (GLCC) codes are proposed to simultaneously provide resiliency against stragglers that do not return computation results in time, security against adversarial workers that deliberately modify results for their own benefit, and information-theoretic privacy of the dataset even when workers collude. GLCC codes are constructed by first partitioning the dataset into multiple groups, then encoding the dataset using carefully designed interpolating polynomials, and finally sharing multiple encoded data points with each worker, such that interfering computation results across groups can be eliminated at the leader. In particular, GLCC codes include the state-of-the-art Lagrange Coded Computing (LCC) codes as a special case, and exhibit a more flexible tradeoff between communication and computation overheads in optimizing system efficiency. Furthermore, we apply GLCC to distributed training of machine learning models, and demonstrate that GLCC codes achieve a speedup of up to $2.5$-$3.9\times$ over LCC codes in training time, across experiments for training image classifiers on different datasets, model architectures, and straggler patterns.
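To make the encode, compute, and decode steps in the abstract concrete, here is a minimal Python sketch of plain LCC, the special case that the abstract says GLCC includes. It is not the paper's GLCC construction: it has no grouping, no privacy masking, and no adversary correction, and handles stragglers only. The field modulus `P`, the evaluation points, the example polynomial `f`, and the function names are all assumptions chosen for illustration.

```python
# Sketch of plain LCC over a prime field (straggler resilience only).
# Everything here (P, f, point choices, names) is a hypothetical illustration,
# not the GLCC scheme from the paper.

P = 2_147_483_647  # assumed prime modulus (2**31 - 1)


def lagrange_eval(xs, ys, z, p=P):
    """Evaluate at z the unique polynomial of degree < len(xs)
    passing through the points (xs[i], ys[i]), working mod p."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (z - xj) % p
                den = den * (xi - xj) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


DEG_F = 2


def f(x, p=P):
    """Example polynomial of degree DEG_F that the workers evaluate."""
    return (x * x + 3 * x + 1) % p


def lcc_round(data, n_workers, survivors):
    """Encode `data`, let each worker apply f to its share, and decode
    f(data[i]) for every i from the workers listed in `survivors`."""
    k = len(data)
    betas = list(range(1, k + 1))                   # u(beta_i) = data[i]
    alphas = list(range(k + 1, k + 1 + n_workers))  # worker evaluation points

    # Leader: worker n receives the encoded share u(alpha_n).
    shares = [lagrange_eval(betas, data, a) for a in alphas]

    # Workers: each computes f on its share, i.e. one point of f(u(z)).
    results = [f(s) for s in shares]

    # f(u(z)) has degree at most DEG_F * (k - 1), so this many results suffice.
    needed = DEG_F * (k - 1) + 1
    if len(survivors) < needed:
        raise ValueError(f"need at least {needed} results, got {len(survivors)}")

    xs = [alphas[n] for n in survivors[:needed]]
    ys = [results[n] for n in survivors[:needed]]

    # Leader: interpolate f(u(z)) and read it off at the beta points.
    return [lagrange_eval(xs, ys, b) for b in betas]


if __name__ == "__main__":
    # 8 workers, of which workers 1, 4 and 6 straggle and never respond.
    print(lcc_round([5, 11, 42], n_workers=8, survivors=[0, 2, 3, 5, 7]))
```

In this sketch each worker holds a single encoded share. According to the abstract, GLCC instead gives each worker multiple encoded data points from a grouped dataset, which is where its computation-communication tradeoff comes from.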


Source journal

IEEE Transactions on Communications (Engineering & Technology: Telecommunications)

CiteScore: 16.10

Self-citation rate: 8.40%

Articles published: 528

Review time: 4.1 months

About the journal: The IEEE Transactions on Communications is dedicated to publishing high-quality manuscripts that showcase advancements in the state of the art of telecommunications. Our scope encompasses all aspects of telecommunications, including telephone, telegraphy, facsimile, and television, facilitated by electromagnetic propagation methods such as radio, wire, aerial, underground, coaxial, and submarine cables, as well as waveguides, communication satellites, and lasers. We cover telecommunications in various settings, including marine, aeronautical, space, and fixed station services, addressing topics such as repeaters, radio relaying, signal storage, regeneration, error detection and correction, multiplexing, carrier techniques, communication switching systems, data communications, and communication theory. Join us in advancing the field of telecommunications through groundbreaking research and innovation.