Elena Aragon, J. M. Jiménez, Arian Maghazeh, J. Rasmusson, Unmesh D. Bordoloi
Title: Pattern matching in OpenCL: GPU vs CPU energy consumption on two mobile chipsets
Venue: International Workshop on OpenCL, pages 5:1-5:7
Published: 2014-05-12
DOI: https://doi.org/10.1145/2664666.2664671
Citations: 8
Abstract
Adaptations of the Aho-Corasick (AC) algorithm to high-performance graphics processors (GPUs) have garnered increasing attention in recent years. However, no results have been reported for implementations on mobile GPUs. In this paper, we show that implementing a state-of-the-art parallel Aho-Corasick algorithm on a mobile GPU delivers significant speedups. We study several implementation optimizations, some of which may seem counter-intuitive relative to standard optimizations for high-end GPUs. More importantly, we measure the energy consumed by the individual components of the OpenCL application rather than reporting only the aggregate. We show that the GPU implementation yields considerable energy savings compared to the CPU implementation of the AC algorithm.