Boosting symbolic execution via constraint solving time prediction (experience paper)

Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. Pub Date: 2021-07-11. DOI: 10.1145/3460319.3464813

Sicheng Luo, Hui Xu, Yanxiang Bi, Xin Wang, Yangfan Zhou


Citations: 5

Abstract

Symbolic execution is an essential approach for automated test case generation. However, it generally does not scale to large programs. One critical reason is that the constraint solving problems arising in symbolic execution are generally hard, so the symbolic execution process may get stuck solving them. To mitigate this issue, symbolic execution tools generally rely on a timeout threshold to terminate solving. Such a timeout is usually set to a fixed, predefined value, e.g., five minutes in angr. Nevertheless, choosing a proper timeout is critical to the tool's efficiency. This paper proposes an approach that tackles the problem by predicting the time required to solve a constraint model, so that the symbolic execution engine can use this information to decide whether to continue the current solving process. Because the prediction itself has a cost, our approach triggers the predictor only when the solving time has exceeded a relatively small value. We show that such a predictor can achieve promising performance with several different machine learning models and datasets. By further employing an adaptive design, the predictor achieves an F1-score ranging from 0.743 to 0.800 on these datasets. We then apply the predictor to eight programs and conduct simulation experiments. Results show that the efficiency of constraint solving for symbolic execution can be improved by 1.25x to 3x, depending on the distribution of the hardness of the programs' constraint models.
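The trigger-then-predict policy described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's implementation: the function names, the `trigger` and `prediction_cost` parameters, and the oracle predictor are all assumptions made for illustration. It compares the total solving time under a fixed timeout with the time spent when a predictor is consulted once a query has run past a small trigger threshold and the query is abandoned if the predictor expects a timeout.

```python
from typing import Callable, Sequence


def time_with_fixed_timeout(solve_times: Sequence[float], timeout: float) -> float:
    """Total time when every query runs until it finishes or hits the timeout."""
    return sum(min(t, timeout) for t in solve_times)


def time_with_prediction(
    solve_times: Sequence[float],
    timeout: float,
    trigger: float,
    predicts_timeout: Callable[[float], bool],
    prediction_cost: float = 0.0,
) -> float:
    """Total time when a predictor is consulted after `trigger` seconds.

    `predicts_timeout` is a hypothetical stand-in for a learned model. Here it
    receives the true solving time purely to keep the simulation simple; a real
    predictor would only see features of the constraint model.
    """
    total = 0.0
    for t in solve_times:
        if t <= trigger:
            # Easy query: solved before the predictor is ever consulted.
            total += t
            continue
        # The query has run for `trigger` seconds; pay for one prediction.
        total += trigger + prediction_cost
        if predicts_timeout(t):
            # Predicted to exceed the timeout: abandon the query now.
            continue
        # Otherwise keep solving until it finishes or the timeout fires.
        total += min(t, timeout) - trigger
    return total


# Hypothetical workload: true solving times in seconds (made-up values).
solve_times = [0.5, 2.0, 400.0, 30.0, 1000.0]
timeout = 300.0   # fixed five-minute timeout
trigger = 1.0     # consult the predictor after one second (assumed value)


def oracle(t: float) -> bool:
    """Idealized predictor that is always right (an assumption)."""
    return t > timeout


baseline = time_with_fixed_timeout(solve_times, timeout)
predicted = time_with_prediction(solve_times, timeout, trigger, oracle)
print(f"fixed timeout: {baseline} s, with prediction: {predicted} s")
```

With a perfect predictor the saving comes entirely from the queries that would otherwise run into the timeout, which is consistent with the abstract's remark that the gain depends on how hardness is distributed across constraint models. An imperfect predictor would additionally abandon some solvable queries or let some hopeless ones run on, which this sketch does not model.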

