Automated test generation for OpenCL kernels using fuzzing and constraint solving

Chao Peng, A. Rajan
{"title":"Automated test generation for OpenCL kernels using fuzzing and constraint solving","authors":"Chao Peng, A. Rajan","doi":"10.1145/3366428.3380768","DOIUrl":null,"url":null,"abstract":"Graphics Processing Units (GPUs) are massively parallel processors offering performance acceleration and energy efficiency unmatched by current processors (CPUs) in computers. These advantages along with recent advances in the programmability of GPUs have made them attractive for general-purpose computations. Despite the advances in programmability, GPU kernels are hard to code and analyse due to the high complexity of memory sharing patterns, striding patterns for memory accesses, implicit synchronisation, and combinatorial explosion of thread interleavings. Existing few techniques for testing GPU kernels use symbolic execution for test generation that incur a high overhead, have limited scalability and do not handle all data types. We propose a test generation technique for OpenCL kernels that combines mutation-based fuzzing and selective constraint solving with the goal of being fast, effective and scalable. Fuzz testing for GPU kernels has not been explored previously. Our approach for fuzz testing randomly mutates input kernel argument values with the goal of increasing branch coverage. When fuzz testing is unable to increase branch coverage with random mutations, we gather path constraints for uncovered branch conditions and invoke the Z3 constraint solver to generate tests for them. In addition to the test generator, we also present a schedule amplifier that simulates multiple work-group schedules, with which to execute each of the generated tests. The schedule amplifier is designed to help uncover inter work-group data races. We evaluate the effectiveness of the generated tests and schedule amplifier using 217 kernels from open source projects and industry standard benchmark suites measuring branch coverage and fault finding. We find our test generation technique achieves close to 100% coverage and mutation score for majority of the kernels. Overhead incurred in test generation is small (average of 0.8 seconds). We also confirmed our technique scales easily to large kernels, and can support all OpenCL data types, including complex data structures.","PeriodicalId":266831,"journal":{"name":"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit","volume":"248 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3366428.3380768","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

Graphics Processing Units (GPUs) are massively parallel processors offering performance acceleration and energy efficiency unmatched by current CPUs. These advantages, along with recent advances in the programmability of GPUs, have made them attractive for general-purpose computations. Despite the advances in programmability, GPU kernels are hard to code and analyse due to the high complexity of memory sharing patterns, striding patterns for memory accesses, implicit synchronisation, and the combinatorial explosion of thread interleavings. The few existing techniques for testing GPU kernels use symbolic execution for test generation, which incurs a high overhead, has limited scalability, and does not handle all data types. We propose a test generation technique for OpenCL kernels that combines mutation-based fuzzing and selective constraint solving with the goal of being fast, effective and scalable. Fuzz testing for GPU kernels has not been explored previously. Our approach to fuzz testing randomly mutates input kernel argument values with the goal of increasing branch coverage. When fuzz testing is unable to increase branch coverage with random mutations, we gather path constraints for uncovered branch conditions and invoke the Z3 constraint solver to generate tests for them. In addition to the test generator, we also present a schedule amplifier that simulates multiple work-group schedules, under which each of the generated tests is executed. The schedule amplifier is designed to help uncover inter-work-group data races. We evaluate the effectiveness of the generated tests and the schedule amplifier using 217 kernels from open-source projects and industry-standard benchmark suites, measuring branch coverage and fault finding. We find that our test generation technique achieves close to 100% coverage and mutation score for the majority of the kernels. The overhead incurred in test generation is small (an average of 0.8 seconds). We also confirmed that our technique scales easily to large kernels and can support all OpenCL data types, including complex data structures.
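The abstract only outlines the approach, so the sketch below is a hypothetical Python rendering of the fuzz-then-solve loop it describes, not the authors' implementation: kernel argument values are mutated at random while branch coverage grows, and Z3 is invoked only for branches that fuzzing leaves uncovered. The names `fuzz_then_solve`, `mutate`, `run_kernel`, and `branch_hard` are made up for illustration; only the Z3 calls come from the real z3-solver Python API.

```python
# Hypothetical sketch of the fuzz-then-solve idea (not the paper's implementation).
# Requires the z3-solver package: pip install z3-solver
import random
from z3 import Solver, Int, sat

def mutate(args):
    """Randomly perturb one integer kernel argument value (illustrative mutator)."""
    out = list(args)
    i = random.randrange(len(out))
    out[i] += random.randint(-100, 100)
    return out

def fuzz_then_solve(run_kernel, seed_args, uncovered_branches, budget=1000):
    """run_kernel(args) is assumed to execute the instrumented kernel and
    return the set of branch ids covered by that input."""
    covered, tests = set(), []
    args = list(seed_args)
    # Phase 1: mutation-based fuzzing, keeping any input that adds branch coverage.
    for _ in range(budget):
        new_cov = run_kernel(args) - covered
        if new_cov:
            covered |= new_cov
            tests.append(list(args))
        args = mutate(args)
    # Phase 2: selective constraint solving for branches fuzzing did not reach.
    for branch, path_constraints in uncovered_branches.items():
        if branch in covered:
            continue
        solver = Solver()
        solver.add(path_constraints)          # path condition for the uncovered branch
        if solver.check() == sat:
            model = solver.model()
            # In a real tool the model would be mapped back to concrete argument positions.
            tests.append({str(d): model[d].as_long() for d in model.decls()})
    return tests

if __name__ == "__main__":
    # Toy stand-in for a kernel with a hard-to-hit branch `if (n > 41 && n % 7 == 3)`.
    n = Int("n")
    branches = {"branch_hard": [n > 41, n % 7 == 3]}

    def fake_run(args):
        return {"branch_hard"} if args[0] > 41 and args[0] % 7 == 3 else {"branch_easy"}

    print(fuzz_then_solve(fake_run, seed_args=[0], uncovered_branches=branches, budget=100))
```

In the actual tool the kernel would presumably be instrumented to report branch outcomes and the path constraints gathered from the kernel source; both are stubbed out here.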
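The schedule amplifier is likewise described only at a high level, so the following is a speculative sketch of the oracle behind it: execute each generated test under several simulated work-group orders and flag any test whose output depends on the order, since an order-dependent result points to an inter-work-group data race. `amplify_schedules` and `emulate_groups` are hypothetical names; the real tool presumably realises the schedules by transforming the kernel or its launch rather than by host-side emulation.

```python
# Speculative sketch of the schedule-amplification oracle (assumed design).
import random

def amplify_schedules(emulate_groups, test_input, num_groups, num_schedules=5):
    """emulate_groups(test_input, group_order) is assumed to run the kernel's
    work-groups sequentially in the given order and return the output buffer."""
    baseline = emulate_groups(test_input, list(range(num_groups)))
    for _ in range(num_schedules):
        order = list(range(num_groups))
        random.shuffle(order)                 # one simulated work-group schedule
        if emulate_groups(test_input, order) != baseline:
            return True                       # order-dependent output: likely inter-work-group race
    return False
```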