Prioritizing Alerts from Multiple Static Analysis Tools, Using Classification Models

Lori Flynn, William Snavely, David Svoboda, Nathan M. VanHoudnos, Richard Qin, Jennifer Burns, D. Zubrow, R. Stoddard, Guillermo Marce-Santurio
{"title":"Prioritizing Alerts from Multiple Static Analysis Tools, Using Classification Models","authors":"Lori Flynn, William Snavely, David Svoboda, Nathan M. VanHoudnos, Richard Qin, Jennifer Burns, D. Zubrow, R. Stoddard, Guillermo Marce-Santurio","doi":"10.1145/3194095.3194100","DOIUrl":null,"url":null,"abstract":"Static analysis (SA) tools examine code for flaws without executing the code, and produce warnings (\"alerts\") about possible flaws. A human auditor then evaluates the validity of the purported code flaws. The effort required to manually audit all alerts and repair all confirmed code flaws is often too much for a project's budget and schedule. An alert triaging tool enables strategically prioritizing alerts for examination, and could use classifier confidence. We developed and tested classification models that predict if static analysis alerts are true or false positives, using a novel combination of multiple static analysis tools, features from the alerts, alert fusion, code base metrics, and archived audit determinations. We developed classifiers using a partition of the data, then evaluated the performance of the classifier using standard measurements, including specificity, sensitivity, and accuracy. Test results and overall data analysis show accurate classifiers were developed, and specifically using multiple SA tools increased classifier accuracy, but labeled data for many types of flaws were inadequately represented (if at all) in the archive data, resulting in poor predictive accuracy for many of those flaws.","PeriodicalId":103582,"journal":{"name":"2018 IEEE/ACM 1st International Workshop on Software Qualities and their Dependencies (SQUADE)","volume":"196 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE/ACM 1st International Workshop on Software Qualities and their Dependencies (SQUADE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3194095.3194100","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 24

Abstract

Static analysis (SA) tools examine code for flaws without executing the code, and produce warnings ("alerts") about possible flaws. A human auditor then evaluates the validity of the purported code flaws. The effort required to manually audit all alerts and repair all confirmed code flaws is often too much for a project's budget and schedule. An alert triaging tool enables strategically prioritizing alerts for examination, and could use classifier confidence. We developed and tested classification models that predict if static analysis alerts are true or false positives, using a novel combination of multiple static analysis tools, features from the alerts, alert fusion, code base metrics, and archived audit determinations. We developed classifiers using a partition of the data, then evaluated the performance of the classifier using standard measurements, including specificity, sensitivity, and accuracy. Test results and overall data analysis show accurate classifiers were developed, and specifically using multiple SA tools increased classifier accuracy, but labeled data for many types of flaws were inadequately represented (if at all) in the archive data, resulting in poor predictive accuracy for many of those flaws.
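The abstract describes the overall workflow: train a binary classifier on alert features, alert-fusion signals, code-base metrics, and archived audit determinations, hold out a partition of the data, and evaluate with accuracy, sensitivity, and specificity, with classifier confidence available for ranking alerts. The sketch below (Python, scikit-learn) illustrates that general workflow only; the feature names and synthetic data are hypothetical placeholders, not the authors' dataset or implementation.

# Minimal sketch (not the paper's implementation): predict whether a
# static-analysis alert is a true or false positive, then report accuracy,
# sensitivity, and specificity on a held-out partition.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Placeholder feature table standing in for alert features, alert-fusion
# signals, and code-base metrics; real data would come from audit archives.
alerts = pd.DataFrame({
    "tool_count": rng.integers(1, 4, n),           # how many SA tools flagged this location (alert fusion)
    "lines_in_function": rng.integers(5, 300, n),  # code-base metric
    "cyclomatic_complexity": rng.integers(1, 40, n),
    "checker_id": rng.integers(0, 20, n),          # encoded checker/rule identifier
})
# Archived audit determination: 1 = true positive, 0 = false positive.
labels = (alerts["tool_count"] + rng.normal(0, 1, n) > 2).astype(int)

# Develop the classifier on one partition, evaluate on the other.
X_train, X_test, y_train, y_test = train_test_split(
    alerts, labels, test_size=0.3, random_state=0, stratify=labels)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # true-positive rate
specificity = tn / (tn + fp)   # true-negative rate
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")

# Classifier confidence (predicted probability of "true positive") could be
# used to rank alerts for auditing, as the abstract suggests.
confidence = clf.predict_proba(X_test)[:, 1]

A triaging tool would sort unaudited alerts by this confidence score so that auditors examine the alerts most likely to be real flaws first.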