Chiplet-GAN: Chiplet-Based Accelerator Design for Scalable Generative Adversarial Network Inference [Feature]
Yuechen Chen, Ahmed Louri, Fabrizio Lombardi, Shanshan Liu
IEEE Circuits and Systems Magazine, published 2024-08-19 (Journal Article)
DOI: https://doi.org/10.1109/mcas.2024.3359571
Impact factor: 5.6; JCR: Q1 (Engineering, Electrical & Electronic)
Citations: 0
Abstract
Generative adversarial networks (GANs) have emerged as a powerful solution for generating synthetic data in large-scale machine learning systems when large, labeled training datasets are scarce or costly to obtain. Recent advances in GAN models have extended their applications across diverse domains, including medicine, robotics, and content synthesis. These advanced GAN models have gained recognition for the excellent accuracy they achieve by scaling up the model. However, existing accelerators face scalability challenges when dealing with large-scale GAN models: as the size of a GAN model increases, the demand for computation and communication resources during inference continues to grow. To address this scalability issue, this article proposes Chiplet-GAN, a chiplet-based accelerator design for GAN inference. Chiplet-GAN scales its computation capability by adding more chiplets to the system. To handle the communication demand that grows with the system and the model, a novel interconnection network with adaptive topology and passive/active network links is developed to provide adequate communication support for Chiplet-GAN. Coupled with workload partition and allocation algorithms, Chiplet-GAN reduces execution time and energy consumption for GAN inference workloads as both the model and the chiplet system scale. Evaluation results using various GAN models show the effectiveness of Chiplet-GAN. On average, compared to GANAX, SpAtten, and Simba, Chiplet-GAN reduces execution time and energy consumption by 34% and 21%, respectively. Furthermore, as the system scales for large-scale GAN model inference, Chiplet-GAN reduces execution time by up to 63% compared to Simba, a chiplet-based accelerator.
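As a reading aid, the percentage reductions quoted in the abstract can be restated as speedup factors. The sketch below is plain arithmetic on the reported figures only; the function name `speedup_from_reduction` is a hypothetical helper introduced for illustration and is not part of the article.

```python
def speedup_from_reduction(reduction_pct: float) -> float:
    """Convert a percentage reduction in execution time into a speedup factor.

    A reduction of r percent leaves (1 - r/100) of the original time,
    so the speedup is the reciprocal of that fraction.
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    return 1.0 / (1.0 - reduction_pct / 100.0)


# Execution-time reductions quoted in the abstract (labels are paraphrased).
reported = {
    "average vs. GANAX, SpAtten, and Simba": 34.0,
    "best case vs. Simba at scale": 63.0,
}

for label, pct in reported.items():
    factor = speedup_from_reduction(pct)
    print(f"{label}: {pct:.0f}% less time is about a {factor:.2f}x speedup")
```

Under this reading, the 34% average reduction corresponds to roughly a 1.5x speedup and the 63% best case to roughly 2.7x. The 21% energy figure is a reduction in consumption, so it is not converted here.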
About the journal:
The IEEE Circuits and Systems Magazine covers the subject areas represented by the Society's transactions, including: analog, passive, switched-capacitor, and digital filters; electronic circuits, networks, graph theory, and RF communication circuits; system theory; discrete, IC, and VLSI circuit design; multidimensional circuits and systems; large-scale systems and power networks; nonlinear circuits and systems, wavelets, filter banks, and applications; neural networks; and signal processing. Content also covers the areas represented by the Society technical committees: analog signal processing, cellular neural networks and array computing, circuits and systems for communications, computer-aided network design, digital signal processing, multimedia systems and applications, neural systems and applications, nonlinear circuits and systems, power systems and power electronics and circuits, sensors and micromachining, visual signal processing and communication, and VLSI systems and applications. Lastly, the magazine covers the interests represented by the widespread conference activity of the IEEE Circuits and Systems Society. In addition to the technical articles, the magazine covers Society administrative activities, such as the meetings of the Board of Governors; Society People, such as the stories of award winners (fellows, medalists, and so forth); and Places reached by the Society, including readable reports from the Society's conferences around the world.