MIPP: A microbenchmark suite for performance, power, and energy consumption characterization of GPU architectures

2016 11th IEEE Symposium on Industrial Embedded Systems (SIES). Pub Date: 2016-05-23. DOI: 10.1109/SIES.2016.7509423

N. Bombieri, F. Busato, F. Fummi, Michele Scala


Citations: 4

Abstract



GPU-accelerated applications are becoming increasingly common in high-performance computing as well as in low-power heterogeneous embedded systems. Nevertheless, GPU programming is challenging, especially when an application has to be tuned to take full advantage of the GPU architectural configuration. Tuning is even more challenging when it must account for power and energy consumption, which have emerged as first-order design constraints alongside performance. Resolving bottlenecks of a GPU application, such as high thread divergence or poor memory coalescing, affects overall performance, power, and energy consumption differently. The impact also depends on the GPU device on which the application runs. This paper presents a suite of microbenchmarks: specialized chunks of GPU code that exercise specific device components (e.g., arithmetic instruction units, shared memory, cache, DRAM) and report the actual characteristics of those components in terms of throughput, power, and energy consumption. The suite aims to enrich standard profiler information and to guide the tuning of a GPU application on a specific GPU architecture with all three design constraints in view (power, performance, and energy consumption). The paper presents the results of applying the suite to characterize two different GPU devices and to understand how application tuning may affect each of them differently.
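The three quantities named in the abstract are related by simple arithmetic: throughput is work divided by elapsed time, and energy is power integrated over that time. The sketch below shows one way these figures could be derived for a single microbenchmark run. It is not taken from the paper: the function `characterize`, the `Characterization` record, the operation count, and the power samples are all hypothetical, and the trapezoidal integration is an assumed post-processing choice.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Characterization:
    """Derived metrics for one microbenchmark run (hypothetical record)."""
    throughput_ops_per_s: float
    avg_power_w: float
    energy_j: float
    energy_per_op_j: float


def characterize(op_count: int,
                 timestamps_s: Sequence[float],
                 power_w: Sequence[float]) -> Characterization:
    """Combine an operation count with sampled power readings.

    timestamps_s and power_w are assumed to be paired samples taken while
    the microbenchmark was running, with the first and last samples marking
    the start and end of the run.
    """
    if len(timestamps_s) != len(power_w) or len(timestamps_s) < 2:
        raise ValueError("need at least two paired (time, power) samples")
    if op_count <= 0:
        raise ValueError("op_count must be positive")

    elapsed_s = timestamps_s[-1] - timestamps_s[0]
    if elapsed_s <= 0:
        raise ValueError("timestamps must span a positive interval")

    # Energy is the integral of power over time; approximate it with the
    # trapezoidal rule over consecutive samples.
    energy_j = sum(
        0.5 * (power_w[i] + power_w[i + 1]) * (timestamps_s[i + 1] - timestamps_s[i])
        for i in range(len(timestamps_s) - 1)
    )

    return Characterization(
        throughput_ops_per_s=op_count / elapsed_s,
        avg_power_w=energy_j / elapsed_s,
        energy_j=energy_j,
        energy_per_op_j=energy_j / op_count,
    )


if __name__ == "__main__":
    # Made-up numbers for illustration only, not measurements from the paper.
    result = characterize(
        op_count=2_000_000_000,
        timestamps_s=[0.0, 0.5, 1.0, 1.5, 2.0],
        power_w=[50.0, 60.0, 60.0, 60.0, 50.0],
    )
    print(result)
```

Reporting energy per operation next to throughput and average power makes it visible when a tuning step speeds up a kernel but raises its power draw enough that total energy does not improve.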

