AIPerf: Automated machine learning as an AI-HPC benchmark

IF 6.2 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Big Data Mining and Analytics | Pub Date: 2021-03-12 | DOI: 10.26599/BDMA.2021.9020004

Zhixiang Ren;Yongheng Liu;Tianhui Shi;Lei Xie;Yue Zhou;Jidong Zhai;Youhui Zhang;Yunquan Zhang;Wenguang Chen

Big Data Mining and Analytics, vol. 4, no. 3, pp. 208-220. Full text: https://ieeexplore.ieee.org/document/9430136/ (PDF: https://ieeexplore.ieee.org/iel7/8254253/9430128/09430136.pdf)

Citations: 13

Abstract

The plethora of complex Artificial Intelligence (AI) algorithms and available High-Performance Computing (HPC) power stimulates the expeditious development of AI components with heterogeneous designs. Consequently, the need for cross-stack performance benchmarking of AI-HPC systems has rapidly emerged. In particular, the de facto HPC benchmark, LINPACK, cannot reflect AI computing power and input/output performance without a representative workload. Current popular AI benchmarks, such as MLPerf, have a fixed problem size and therefore limited scalability. To address these issues, we propose an end-to-end benchmark suite utilizing automated machine learning, which not only represents real AI scenarios but is also auto-adaptively scalable to machines of various scales. We implement the algorithms in a highly parallel and flexible way to ensure efficiency and optimization potential on diverse systems with customizable configurations. We utilize Operations Per Second (OPS), which is measured in an analytical and systematic approach, as the major metric to quantify AI performance. We perform evaluations on various systems to ensure the benchmark's stability and scalability, from 4 nodes with 32 NVIDIA Tesla T4 (56.1 Tera-OPS measured) up to 512 nodes with 4096 Huawei Ascend 910 (194.53 Peta-OPS measured), and the results show near-linear weak scalability. With a flexible workload and a single metric, AIPerf can easily scale on and rank AI-HPC systems, providing a powerful benchmark suite for the coming supercomputing era.
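To make the reported figures easier to compare, the following is a minimal Python sketch that divides the two measured OPS totals from the abstract by their accelerator counts. It also includes a helper for weak-scaling efficiency using the common definition (per-device throughput at the larger scale divided by per-device throughput at the baseline). That definition, the function names, and the numbers in the final example are assumptions for illustration only; the abstract does not state how the authors compute scalability, and the two reported systems use different hardware, so they should not be compared with this helper.

```python
TERA = 1e12
PETA = 1e15


def per_device_ops(total_ops: float, devices: int) -> float:
    """Average operations per second delivered by one accelerator."""
    if devices <= 0:
        raise ValueError("devices must be positive")
    return total_ops / devices


def weak_scaling_efficiency(base_ops: float, base_devices: int,
                            scaled_ops: float, scaled_devices: int) -> float:
    """Ratio of per-device throughput at scale to per-device throughput
    at the baseline; 1.0 means perfectly linear weak scaling.
    (Common textbook definition, assumed here, not taken from the paper.)"""
    base = per_device_ops(base_ops, base_devices)
    scaled = per_device_ops(scaled_ops, scaled_devices)
    return scaled / base


# Figures reported in the abstract.
t4_per_device = per_device_ops(56.1 * TERA, 32)
ascend_per_device = per_device_ops(194.53 * PETA, 4096)

print(f"Tesla T4:   {t4_per_device / TERA:.2f} Tera-OPS per device")      # 1.75
print(f"Ascend 910: {ascend_per_device / TERA:.2f} Tera-OPS per device")  # 47.49

# Hypothetical numbers (NOT from the paper): the same hardware measured
# at 8 and at 64 devices.
example_efficiency = weak_scaling_efficiency(10 * TERA, 8, 76 * TERA, 64)
print(f"Example weak-scaling efficiency: {example_efficiency:.2f}")  # 0.95
```

An efficiency close to 1.0 across increasing device counts on the same hardware is what "near-linear weak scalability" would look like under this definition.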




Source journal

Big Data Mining and Analytics (Computer Science: Computer Science Applications)

CiteScore: 20.90

Self-citation rate: 2.20%

Journal description: Big Data Mining and Analytics, published by Tsinghua University Press, presents research on big data and its applications. The journal covers the exploration and analysis of large volumes of data from diverse sources to uncover hidden patterns, correlations, insights, and knowledge, including recent developments, open research issues, and solutions in data mining techniques, data analytics, and their practical applications. It is indexed and abstracted in ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, and others.
