针对混合 ML 和非 ML 工作负载的内存计算、数字 ML 加速器和 RISC-V 内核异构三维集成的热约束代码设计

IF 2.8 2区 工程技术 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-06-28 DOI:10.1109/TVLSI.2024.3415481
Yuan-Chun Luo;Anni Lu;Janak Sharda;Moritz Scherer;Jorge Tomas Gomez;Syed Shakib Sarwar;Ziyun Li;Reid Frederick Pinkham;Barbara De Salvo;Shimeng Yu
{"title":"针对混合 ML 和非 ML 工作负载的内存计算、数字 ML 加速器和 RISC-V 内核异构三维集成的热约束代码设计","authors":"Yuan-Chun Luo;Anni Lu;Janak Sharda;Moritz Scherer;Jorge Tomas Gomez;Syed Shakib Sarwar;Ziyun Li;Reid Frederick Pinkham;Barbara De Salvo;Shimeng Yu","doi":"10.1109/TVLSI.2024.3415481","DOIUrl":null,"url":null,"abstract":"Heterogeneous 3-D (H3D) integration not only reduces the chip form factor and fabrication cost but also allows the merging of diverse compute paradigms that suit different applications. This is especially attractive when modern algorithms, such as the augmented reality/virtual reality (AR/VR) workloads, consist of mixed machine learning (ML) and non-ML workloads. To date, codesign that considers the thermal, latency, and power constraints of H3D hardware is largely unexplored. In this work, a thermally aware framework for H3D hardware design is developed to evaluate the thermal, latency, and power trade-offs for a heterogeneous system with compute-in-memory (CIM), digital ML cores, and RISC-V cores. The framework solves for runtime tunable operating points described as the optimal speedup factor, the number of activated RISC-V cores, the cooling coefficient, and the activity rate based on user-defined criteria, achieving up to 135 TOPS and 215 TOPS/W under \n<inline-formula> <tex-math>$74~^{\\circ }$ </tex-math></inline-formula>\nC for the AR/VR workloads.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"32 9","pages":"1718-1725"},"PeriodicalIF":2.8000,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Thermally Constrained Codesign of Heterogeneous 3-D Integration of Compute-in-Memory, Digital ML Accelerator, and RISC-V Cores for Mixed ML and Non-ML Workloads\",\"authors\":\"Yuan-Chun Luo;Anni Lu;Janak Sharda;Moritz Scherer;Jorge Tomas Gomez;Syed Shakib Sarwar;Ziyun Li;Reid Frederick Pinkham;Barbara De Salvo;Shimeng Yu\",\"doi\":\"10.1109/TVLSI.2024.3415481\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Heterogeneous 3-D (H3D) integration not only reduces the chip form factor and fabrication cost but also allows the merging of diverse compute paradigms that suit different applications. This is especially attractive when modern algorithms, such as the augmented reality/virtual reality (AR/VR) workloads, consist of mixed machine learning (ML) and non-ML workloads. To date, codesign that considers the thermal, latency, and power constraints of H3D hardware is largely unexplored. In this work, a thermally aware framework for H3D hardware design is developed to evaluate the thermal, latency, and power trade-offs for a heterogeneous system with compute-in-memory (CIM), digital ML cores, and RISC-V cores. The framework solves for runtime tunable operating points described as the optimal speedup factor, the number of activated RISC-V cores, the cooling coefficient, and the activity rate based on user-defined criteria, achieving up to 135 TOPS and 215 TOPS/W under \\n<inline-formula> <tex-math>$74~^{\\\\circ }$ </tex-math></inline-formula>\\nC for the AR/VR workloads.\",\"PeriodicalId\":13425,\"journal\":{\"name\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"volume\":\"32 9\",\"pages\":\"1718-1725\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2024-06-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10576629/\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10576629/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

摘要

异构三维(H3D)集成不仅能降低芯片外形尺寸和制造成本,还能融合适合不同应用的各种计算模式。当现代算法(如增强现实/虚拟现实(AR/VR)工作负载)由混合机器学习(ML)和非 ML 工作负载组成时,这一点尤其具有吸引力。迄今为止,考虑到 H3D 硬件的热量、延迟和功耗限制的代码设计在很大程度上尚未得到探索。在这项工作中,为 H3D 硬件设计开发了一个热感知框架,用于评估具有内存计算(CIM)、数字 ML 内核和 RISC-V 内核的异构系统的热、延迟和功耗权衡。该框架根据用户定义的标准,解决了运行时可调整的操作点,这些操作点被描述为最佳加速因子、激活的 RISC-V 内核数量、冷却系数和活动率,在 AR/VR 工作负载的 $74~^{\circ }$ C 条件下实现了高达 135 TOPS 和 215 TOPS/W。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Thermally Constrained Codesign of Heterogeneous 3-D Integration of Compute-in-Memory, Digital ML Accelerator, and RISC-V Cores for Mixed ML and Non-ML Workloads
Heterogeneous 3-D (H3D) integration not only reduces the chip form factor and fabrication cost but also allows the merging of diverse compute paradigms that suit different applications. This is especially attractive when modern algorithms, such as the augmented reality/virtual reality (AR/VR) workloads, consist of mixed machine learning (ML) and non-ML workloads. To date, codesign that considers the thermal, latency, and power constraints of H3D hardware is largely unexplored. In this work, a thermally aware framework for H3D hardware design is developed to evaluate the thermal, latency, and power trade-offs for a heterogeneous system with compute-in-memory (CIM), digital ML cores, and RISC-V cores. The framework solves for runtime tunable operating points described as the optimal speedup factor, the number of activated RISC-V cores, the cooling coefficient, and the activity rate based on user-defined criteria, achieving up to 135 TOPS and 215 TOPS/W under $74~^{\circ }$ C for the AR/VR workloads.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
6.40
自引率
7.10%
发文量
187
审稿时长
3.6 months
期刊介绍: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems have been founded. The editorial board, consisting of international experts, invites original papers which emphasize and merit the novel systems integration aspects of microelectronic systems including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.
期刊最新文献
Table of Contents IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information IEEE Transactions on Very Large Scale Integration (VLSI) Systems Publication Information Table of Contents IEEE Transactions on Very Large Scale Integration (VLSI) Systems Publication Information
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1