Redefining IBM power system design for CORAL

IF 1.3 | CAS Zone 4 (Computer Science) | JCR Q1 (Computer Science) | IBM Journal of Research and Development | Pub Date: 2020-01-03 | DOI: 10.1147/JRD.2019.2963637
S. Roberts;C. Mann;C. Marroquin
IBM Journal of Research and Development, vol. 64, no. 3/4, pp. 2:1-2:10, 2020. https://ieeexplore.ieee.org/document/8949743/
Citations: 2

Abstract

Stipulations in the 2014 Collaboration of Oak Ridge, Argonne, and Livermore (CORAL) joint procurement activity not only motivated a fundamental change in IBM's high-performance computer design, which refocused IBM power systems on compute nodes that can scale to 200 petaflops with access to 2.5 PB of memory, but also served the commercial market for single-server applications. The distribution of both processing elements and memory required a careful look at data movement. The resultant AC922 POWER9 system features NVIDIA V100 GPUs with cache line access granularity, more than double the IO bandwidth of PCIe Gen3, and low-latency interfaces interconnected by the state-of-the-art dual-rail Mellanox CAPI EDR HCAs running at 50 Gb/s. With processing units designed to operate at 250 and 300 W, a single system can produce up to 3,080 W. The overall CORAL solutions achieved power usage effectiveness rankings in the top ten on the Green500. Previous power designs used uniquely designed cabinets and scaled-up infrastructure to achieve efficiency. For successful commercial use, our design uses industry-standard 19-in drawers and racks. Both air- and water-cooled solutions allow for use in a wide range of customer environments. This article documents the novel design features that facilitate data movement and enable new coherent programming models. It describes how three generations of system designs became the foundation for the CORAL contract fulfillment and illustrates key features and specifications of the final product.
Source journal
IBM Journal of Research and Development (Engineering and Technology: Computer Science, Hardware)
Self-citation rate: 0.00%
Articles published: 0
Review time: 6-12 weeks
About the journal: The IBM Journal of Research and Development is a peer-reviewed technical journal, published bimonthly, which features the work of authors in the science, technology, and engineering of information systems. Papers are written for the worldwide scientific research and development community and knowledgeable professionals. Submitted papers are welcome from the IBM technical community and from non-IBM authors on topics relevant to the scientific and technical content of the Journal.