Wafer-Scale Computing: Advancements, Challenges, and Future Perspectives [Feature]

IEEE Circuits and Systems Magazine (IF 5.6; CAS Zone 2, Engineering and Technology; JCR Q1, Engineering, Electrical & Electronic) Pub Date: 2024-03-07 DOI: 10.1109/mcas.2024.3349669

Yang Hu, Xinhan Lin, Huizheng Wang, Zhen He, Xingmao Yu, Jiahao Zhang, Qize Yang, Zheng Xu, Sihan Guan, Jiahao Fang, Haoran Shang, Xinru Tang, Xu Dai, Shaojun Wei, Shouyi Yin



Abstract

Artificial intelligence (AI) built on large models plays an increasingly important role in both academia and industry, and it drives a rapidly growing demand for hardware computing power. Growth in hardware computing power has failed to keep pace with that demand, which has become a significant constraint on the development of AI. Hardware computing power has mainly grown through increases in transistor density and chip area. However, the former is impeded by the end of Moore's Law and Dennard scaling, and the latter is significantly restricted by the difficulty of disrupting legacy fabrication equipment and processes. In recent years, gradually maturing advanced packaging technologies have been increasingly used to build bigger chips that integrate multiple chiplets while still providing interconnections with chip-level density and bandwidth. This technique opens a new path for continuing to increase computing power while leveraging the current fabrication process without significant disruption. Enabled by this technique, a chip can extend to wafer scale (over 10,000 mm$^{2}$), provisioning orders of magnitude more computing capability (several POPS within just one monolithic chip) and die-to-die bandwidth density (over 15 GB/s/mm) than regular chips, and giving rise to a new Wafer-scale Computing paradigm. Compared to conventional high-performance computing paradigms such as multi-accelerator and datacenter-scale computing, Wafer-scale Computing shows remarkable advantages in communication bandwidth, integration density, and programmability potential. Not surprisingly, this disruptive paradigm also brings unprecedented design challenges for hardware architecture, design-/system-technology co-optimization, power and cooling systems, and the compiler tool chain. At present, there are no comprehensive surveys summarizing the current state and design insights of Wafer-scale Computing. This article aims to take the first step in helping academia and industry review existing wafer-scale chips and essential technologies in a one-stop manner, so that readers can conveniently grasp the basic knowledge and key points, understand the achievements and shortcomings of existing research, and contribute to this promising research direction.




Source journal: IEEE Circuits and Systems Magazine (Engineering and Technology; Engineering: Electrical & Electronic)

CiteScore: 9.30

Self-citation rate: 1.40%

Review time: >12 weeks

Journal description: The IEEE Circuits and Systems Magazine covers the subject areas represented by the Society's transactions, including: analog, passive, switched-capacitor, and digital filters; electronic circuits, networks, graph theory, and RF communication circuits; system theory; discrete, IC, and VLSI circuit design; multidimensional circuits and systems; large-scale systems and power networks; nonlinear circuits and systems, wavelets, filter banks, and applications; neural networks; and signal processing. Content also covers the areas represented by the Society technical committees: analog signal processing, cellular neural networks and array computing, circuits and systems for communications, computer-aided network design, digital signal processing, multimedia systems and applications, neural systems and applications, nonlinear circuits and systems, power systems and power electronics and circuits, sensors and micromachining, visual signal processing and communication, and VLSI systems and applications. Lastly, the magazine covers the interests represented by the widespread conference activity of the IEEE Circuits and Systems Society. In addition to technical articles, the magazine covers Society administrative activities, such as meetings of the Board of Governors; Society People, such as stories of award winners (fellows, medalists, and so forth); and Places reached by the Society, including readable reports from the Society's conferences around the world.
