
BenchCouncil Transactions on Benchmarks, Standards and Evaluations: Latest Articles

Open-source computer systems initiative: The motivation, essence, challenges, and methodology
Pub Date : 2022-03-01 DOI: 10.1016/j.tbench.2022.100038
Jianfeng Zhan

The global community faces many pressing and uncertain challenges, such as pandemics and global climate change. Information technology (IT) infrastructure has become the enabler for addressing those challenges. Unfortunately, IT decoupling has distracted the international community and weakened its ability to handle those challenges.

This article launches an open-source computer system (OSCS) initiative to tackle the challenges of IT decoupling. The OSCS movement is where open-source software converges with open-source hardware. Its essence is to utilize the inherent characteristics of a class of representative workloads and to propose an innovative abstraction and methodology for co-exploring the software and hardware design spaces of high-end computer systems, attaining peak performance, security, and other fundamental dimensions. I discuss its four challenges: system complexity; the tradeoff between universal and ideal systems; guaranteeing the quality of computation results and performance under different conditions (e.g., best-case, worst-case, or average-case); and balancing legal, patent, and license issues.

Inspired by the philosophy of building large systems out of smaller functions, I propose the funclet abstraction and methodology to tackle the first challenge. A funclet is a well-defined, evolvable, reusable, independently deployable, and testable functionality with modest complexity. Each funclet interoperates with other funclets through standard bus interfaces or interconnections. Four funclet building blocks (chiplet, HWlet, envlet, and servlet, at the chip, hardware, environment management, and service layers, respectively) form the four-layer funclet architecture. I discuss the advantages of the funclet abstraction and architecture. The project's website is publicly available at https://www.opensourcecomputer.org or https://www.computercouncil.org.

Citations: 0
Call for consistent benchmarking across multi-disciplines
Pub Date : 2021-12-01 DOI: 10.1016/j.tbench.2021.100012
Jianfeng Zhan
Citations: 2
Performance optimization opportunities in the Android software stack
Pub Date : 2021-10-01 DOI: 10.1016/j.tbench.2021.100003
Varun Gohil , Nisarg Ujjainkar , Joycee Mekie , Manu Awasthi

The smartphone hardware and software ecosystems have evolved very rapidly. Multiple innovations in system software, including operating systems, languages, and runtimes, have been made in the last decade. Although the performance of the microarchitecture has been characterized, there is little analysis of application performance bottlenecks in the system software stack, especially for contemporary applications on mobile operating systems.

In this work, we perform system utilization analysis from a software perspective, thereby supplementing the hardware perspective offered by prior work. We focus our analysis on Android-powered smartphones running newer versions of Android. Using 11 representative apps and regions of interest within them, we carry out performance analysis of the entire Android software stack to identify system performance bottlenecks.

We observe that for the majority of apps, the most time-consuming system-level thread is a frame rendering thread. More surprisingly, our results indicate that all apps spend a significant amount of time doing Inter Process Communication (IPC), hinting that the Android IPC stack is a ripe target for performance optimization via software development and a potential target for hardware acceleration.

Citations: 0
MVDI25K: A large-scale dataset of microscopic vaginal discharge images
Pub Date : 2021-10-01 DOI: 10.1016/j.tbench.2021.100008
Lin Li , Jingyi Liu , Fei Yu , Xunkun Wang , Tian-Zhu Xiang

With the widespread application of artificial intelligence technology in the field of biomedical images, deep learning-based detection of vaginal discharge, an important but challenging topic in medical image processing, has drawn an increasing amount of research interest. Although the past few decades have witnessed major advances in object detection in natural scenes, such successes have been slow to carry over to medical images, not only because of the complex background and diverse cell morphology in microscope images, but also because of the scarcity of well-annotated datasets of objects in medical images. Until now, in most hospitals in China, vaginal diseases have often been checked by manually observing cell morphology under the microscope, or by inspectors observing a color reaction experiment. Both methods are time-consuming, inefficient, and susceptible to subjective factors. To this end, we carefully construct the first large-scale dataset of microscopic vaginal discharge images, named MVDI25K, which consists of 25,708 images covering 10 cell categories related to vaginal discharge detection. All the images in the MVDI25K dataset are carefully annotated by experts with bounding-box and object-level labels. In addition, we conduct systematic benchmark experiments on the MVDI25K dataset with 10 representative state-of-the-art (SOTA) deep models, focusing on two key tasks, i.e., object detection and object segmentation. Our research offers the community an opportunity to explore this new field further.

Citations: 4
A parallel sparse approximate inverse preconditioning algorithm based on MPI and CUDA
Pub Date : 2021-10-01 DOI: 10.1016/j.tbench.2021.100007
Yizhou Wang, Wenhao Li, Jiaquan Gao

In this study, we present an efficient parallel sparse approximate inverse (SPAI) preconditioning algorithm based on MPI and CUDA, called HybridSPAI. HybridSPAI optimizes a recent static SPAI preconditioning algorithm and extends it from one GPU to multiple GPUs in order to process large-scale matrices. We make the following significant contributions: (1) a general parallel framework for optimizing the static SPAI preconditioner based on MPI and CUDA is presented, and (2) for each component of the preconditioner, a decision tree is established to choose the optimal kernel for computing it. Experimental results show that HybridSPAI is effective and outperforms the popular preconditioning algorithms in two public libraries as well as a recent parallel SPAI preconditioning algorithm.
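
For readers unfamiliar with static SPAI, the following is a minimal sketch of the general idea, not of HybridSPAI itself: it builds a right approximate inverse M by minimizing ||AM - I||_F column by column on a fixed sparsity pattern. The function name `static_spai`, the choice of the sparsity pattern of A as the default pattern, and the use of dense NumPy arrays on a CPU are all assumptions made for illustration; the abstract does not describe the paper's GPU kernels, MPI partitioning, or decision trees in enough detail to reproduce them here.

```python
import numpy as np


def static_spai(A, pattern=None):
    """Right approximate inverse M minimizing ||A M - I||_F on a fixed pattern.

    Illustrative only: dense arrays, CPU, hypothetical function name.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if pattern is None:
        # Assumption: reuse the sparsity pattern of A for M.
        pattern = A != 0
    M = np.zeros((n, n))
    # Each column is an independent least-squares problem,
    # which is what makes the method attractive for parallel hardware.
    for k in range(n):
        J = np.flatnonzero(pattern[:, k])  # allowed nonzeros in column k of M
        if J.size == 0:
            continue
        # Rows of A that have a nonzero in any of the columns J.
        I = np.flatnonzero(np.any(A[:, J] != 0, axis=1))
        e_k = (I == k).astype(float)  # k-th unit vector restricted to rows I
        m, *_ = np.linalg.lstsq(A[np.ix_(I, J)], e_k, rcond=None)
        M[J, k] = m
    return M


# Small usage example: a 1D Laplacian (tridiagonal) matrix.
n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
M = static_spai(A)
residual = np.linalg.norm(A @ M - np.eye(n))
```

Because the columns are decoupled, each small least-squares problem can be assigned to a separate thread block or process, which is the property that parallel SPAI implementations exploit.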

Citations: 2
MLHarness: A scalable benchmarking system for MLCommons
Pub Date : 2021-10-01 DOI: 10.1016/j.tbench.2021.100002
Yen-Hsiang Chang , Jianhao Pu , Wen-mei Hwu , Jinjun Xiong

With society's growing adoption of machine learning (ML) and deep learning (DL) for various intelligent solutions, it becomes increasingly imperative to standardize a common set of measures for ML/DL models with large-scale open datasets under common development practices and resources, so that people can benchmark and compare models' quality and performance on common ground. MLCommons has emerged recently as a driving force from both industry and academia to orchestrate such an effort. Despite its wide adoption as a standardized benchmark, MLCommons Inference has only included a limited number of ML/DL models (seven in total). This significantly limits the generality of MLCommons Inference's benchmarking results, because there are many more novel ML/DL models from the research community, solving a wide range of problems with different input and output modalities. To address this limitation, we propose MLHarness, a scalable benchmarking harness system for MLCommons Inference with three distinctive features: (1) it codifies the standard benchmark process as defined by MLCommons Inference, including the models, datasets, DL frameworks, and software and hardware systems; (2) it provides an easy and declarative approach for model developers to contribute their models and datasets to MLCommons Inference; and (3) it supports a wide range of models with varying input/output modalities, so that we can scalably benchmark these models across different datasets, frameworks, and hardware systems. This harness system is developed on top of the MLModelScope system and will be open sourced to the community. Our experimental results demonstrate the superior flexibility and scalability of this harness system for MLCommons Inference benchmarking.

Citations: 3
Stars shine: The report of 2021 BenchCouncil awards
Pub Date : 2021-10-01 DOI: 10.1016/j.tbench.2021.100013
Taotao Zhan , Simin Chen

This report introduces the awards presented by the International Open Benchmark Council (BenchCouncil) in 2021 and highlights the award selection rules, committee, awardees, and their contributions.

Citations: 1
Fallout: Distributed systems testing as a service
Pub Date : 2021-10-01 DOI: 10.1016/j.tbench.2021.100010
Matt Fleming, Guy Bolton King, Sean McCarthy, Jake Luciani, Pushkala Pattabhiraman

All modern distributed systems list performance and scalability as their core strengths. Given that optimal performance requires carefully selecting configuration options, and typical cluster sizes can range anywhere from 2 to 300 nodes, it is rare for any two clusters to be exactly the same. Validating the behavior and performance of distributed systems in this large configuration space is challenging without automation that stretches across the software stack. In this paper we present Fallout, an open-source distributed systems testing service that automatically provisions and configures distributed systems and clients, supports running a variety of workloads and benchmarks, and generates performance reports based on collected metrics for visual analysis. We have been running the Fallout service internally at DataStax for over 5 years and have recently open sourced it to support our work with Apache Cassandra, Pulsar, and other open source projects. We describe the architecture of Fallout along with the evolution of its design and the lessons we learned operating this service in a dynamic environment where teams work on different products and favor different benchmarking tools.

Citations: 1
Comparative evaluation of deep learning workloads for leadership-class systems
Pub Date : 2021-10-01 DOI: 10.1016/j.tbench.2021.100005
Junqi Yin, Aristeidis Tsaris, Sajal Dash, Ross Miller, Feiyi Wang, Mallikarjun (Arjun) Shankar

Deep learning (DL) workloads and their performance at scale are becoming important factors to consider as we design, develop, and deploy next-generation high-performance computing systems. Since DL applications rely heavily on DL frameworks and underlying compute (CPU/GPU) stacks, it is essential to gain a holistic understanding of the compute kernels, models, and frameworks of popular DL stacks, and to assess their impact on science-driven, mission-critical applications. At Oak Ridge Leadership Computing Facility (OLCF), we employ a set of micro and macro DL benchmarks established through the Collaboration of Oak Ridge, Argonne, and Livermore (CORAL) to evaluate the AI readiness of our next-generation supercomputers. In this paper, we present our early observations and performance benchmark comparisons between the Nvidia V100-based Summit system with its CUDA stack and an AMD MI100-based testbed system with its ROCm stack. We take a layered perspective on DL benchmarking and point to opportunities for future optimizations in the technologies that we consider.

Citations: 6
Call for establishing benchmark science and engineering
Pub Date : 2021-10-01 DOI: 10.1016/j.tbench.2021.100012
Jianfeng Zhan

Currently, there is no consistent benchmarking across disciplines, and no previous work has tried to relate the different categories of benchmarks that those disciplines use. This article investigates the origin and evolution of the term "benchmark". It summarizes five categories of benchmarks that exist widely across disciplines: measurement standards, standardized data sets with defined properties, representative workloads, representative data sets, and best practices. I believe there are two pressing challenges in growing this discipline: establishing consistent benchmarking across disciplines and developing meta-benchmarks to measure the benchmarks themselves. I propose establishing benchmark science and engineering; one of its primary goals is to set up a standard benchmark hierarchy across disciplines. It is the right time to launch a multi-disciplinary benchmark, standard, and evaluation journal, TBench, to communicate the state of the art and state of the practice of benchmark science and engineering.

Citations: 2