Execution time prediction model for parallel GPU realizations of discrete transforms computation algorithms

IF 1.2 | Zone 4 (Engineering and Technology) | Q3 ENGINEERING, MULTIDISCIPLINARY | Bulletin of the Polish Academy of Sciences-Technical Sciences | Pub Date: 2023-11-06 | DOI: 10.24425/bpasts.2021.139393
Dariusz Puchala, Kamil Stokfiszewski, Kamil Wieloch
Citations: 1

Abstract

Parallel realizations of discrete transform (DT) computation algorithms (DTCAs) on graphics processing units (GPUs) play a significant role in many modern data processing methods used across numerous fields. In this paper the authors propose a novel execution time prediction model that allows accurate and rapid estimation of the execution times of structurally different DTCAs on GPUs of distinct architectures, without the need to conduct actual experiments on physical hardware. The model can guide a system analyst in choosing the optimal GPU hardware for a given computational task involving a particular DT, or in choosing the most appropriate parallel implementation of a selected DT given the limitations of the available hardware. Restricting the model to only the key common features of DTCAs allows the authors to simplify its structure significantly and to design it as a hybrid analytical-simulational method. This hybrid combines the main advantages of the two techniques, namely time-effectiveness and high prediction accuracy, while each technique offsets the major weaknesses of the other. The model is validated experimentally on two structurally different parallel methods of discrete wavelet transform (DWT) computation, namely the direct convolution-based and lattice structure-based schemes, by comparing its predictions with actual measurements taken on six different graphics cards representing a fairly broad spectrum of GPU compute architectures.
The experimental results show an overall average execution time prediction accuracy of 97.2%, with a global maximum prediction error of 14.5% across all conducted experiments, while a single simulation takes 3.5 ms on average. The results suggest that the model is general and can be extrapolated to other DTCAs and different GPU architectures, which, along with its straightforwardness, time-effectiveness and ease of practical application, makes it, in the authors' opinion, a very interesting alternative to related existing solutions.
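
To make the term "direct convolution-based scheme" more concrete, the following is a minimal sequential sketch of one DWT decomposition level. It is not the authors' GPU implementation or their prediction model. The function name `dwt_direct_convolution`, the Haar filter pair, the periodic boundary handling, and the order in which filter taps are applied are all assumptions chosen for illustration. The point is only that each output coefficient depends on a short window of the input and on no other output, which is the kind of structure a GPU could process with one thread per coefficient.

```python
import math


def dwt_direct_convolution(signal, lowpass, highpass):
    """One DWT decomposition level computed by direct filtering and downsampling by 2.

    Every output index k is computed independently of the others, so a parallel
    implementation could assign one thread per k (illustrative assumption).
    """
    n = len(signal)
    if n % 2 != 0:
        raise ValueError("signal length must be even")
    if len(lowpass) != len(highpass):
        raise ValueError("filters must have the same length")

    approx, detail = [], []
    for k in range(n // 2):
        a = 0.0
        d = 0.0
        for j, (h, g) in enumerate(zip(lowpass, highpass)):
            # Periodic extension at the boundary (an assumption of this sketch).
            x = signal[(2 * k + j) % n]
            a += h * x
            d += g * x
        approx.append(a)
        detail.append(d)
    return approx, detail


# Haar filter pair, used here only because it is the shortest orthogonal example.
s = 1.0 / math.sqrt(2.0)
HAAR_LOW = [s, s]
HAAR_HIGH = [s, -s]

approx, detail = dwt_direct_convolution([1.0, 2.0, 3.0, 4.0], HAAR_LOW, HAAR_HIGH)
```

With an orthogonal filter pair such as Haar, the sum of squared coefficients equals the sum of squared input samples, which gives a quick sanity check on the result.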
Source journal
CiteScore: 2.80
Self-citation rate: 16.70%
Articles published: 0
Review time: 6-12 weeks
Journal description: The Bulletin of the Polish Academy of Sciences: Technical Sciences has been published bimonthly by Division IV Engineering Sciences of the Polish Academy of Sciences since the PAS was established in 1952. The journal is peer-reviewed and is published in both printed and electronic form. It publishes original, high-quality papers from multidisciplinary engineering sciences, with the following topics preferred: Artificial and Computational Intelligence, Biomedical Engineering and Biotechnology, Civil Engineering, Control, Informatics and Robotics, Electronics, Telecommunication and Optoelectronics, Mechanical and Aeronautical Engineering, Thermodynamics, Material Science and Nanotechnology, Power Systems and Power Electronics.