使用混合程序分析挖掘SQL注入和跨站点脚本漏洞

2013 35th International Conference on Software Engineering (ICSE) Pub Date : 2013-05-18 DOI:10.1109/ICSE.2013.6606610

Lwin Khin Shar, Hee Beng Kuan Tan, L. Briand

{"title":"使用混合程序分析挖掘SQL注入和跨站点脚本漏洞","authors":"Lwin Khin Shar, Hee Beng Kuan Tan, L. Briand","doi":"10.1109/ICSE.2013.6606610","DOIUrl":null,"url":null,"abstract":"In previous work, we proposed a set of static attributes that characterize input validation and input sanitization code patterns. We showed that some of the proposed static attributes are significant predictors of SQL injection and cross site scripting vulnerabilities. Static attributes have the advantage of reflecting general properties of a program. Yet, dynamic attributes collected from execution traces may reflect more specific code characteristics that are complementary to static attributes. Hence, to improve our initial work, in this paper, we propose the use of dynamic attributes to complement static attributes in vulnerability prediction. Furthermore, since existing work relies on supervised learning, it is dependent on the availability of training data labeled with known vulnerabilities. This paper presents prediction models that are based on both classification and clustering in order to predict vulnerabilities, working in the presence or absence of labeled training data, respectively. In our experiments across six applications, our new supervised vulnerability predictors based on hybrid (static and dynamic) attributes achieved, on average, 90% recall and 85% precision, that is a sharp increase in recall when compared to static analysis-based predictions. Though not nearly as accurate, our unsupervised predictors based on clustering achieved, on average, 76% recall and 39% precision, thus suggesting they can be useful in the absence of labeled training data.","PeriodicalId":322423,"journal":{"name":"2013 35th International Conference on Software Engineering (ICSE)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"99","resultStr":"{\"title\":\"Mining SQL injection and cross site scripting vulnerabilities using hybrid program analysis\",\"authors\":\"Lwin Khin Shar, Hee Beng Kuan Tan, L. Briand\",\"doi\":\"10.1109/ICSE.2013.6606610\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In previous work, we proposed a set of static attributes that characterize input validation and input sanitization code patterns. We showed that some of the proposed static attributes are significant predictors of SQL injection and cross site scripting vulnerabilities. Static attributes have the advantage of reflecting general properties of a program. Yet, dynamic attributes collected from execution traces may reflect more specific code characteristics that are complementary to static attributes. Hence, to improve our initial work, in this paper, we propose the use of dynamic attributes to complement static attributes in vulnerability prediction. Furthermore, since existing work relies on supervised learning, it is dependent on the availability of training data labeled with known vulnerabilities. This paper presents prediction models that are based on both classification and clustering in order to predict vulnerabilities, working in the presence or absence of labeled training data, respectively. In our experiments across six applications, our new supervised vulnerability predictors based on hybrid (static and dynamic) attributes achieved, on average, 90% recall and 85% precision, that is a sharp increase in recall when compared to static analysis-based predictions. Though not nearly as accurate, our unsupervised predictors based on clustering achieved, on average, 76% recall and 39% precision, thus suggesting they can be useful in the absence of labeled training data.\",\"PeriodicalId\":322423,\"journal\":{\"name\":\"2013 35th International Conference on Software Engineering (ICSE)\",\"volume\":\"94 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-05-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"99\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 35th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE.2013.6606610\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 35th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2013.6606610","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 99

摘要

在之前的工作中，我们提出了一组描述输入验证和输入清理代码模式的静态属性。我们展示了一些建议的静态属性是SQL注入和跨站点脚本漏洞的重要预测因素。静态属性具有反映程序一般属性的优点。然而，从执行跟踪中收集的动态属性可能反映出与静态属性互补的更具体的代码特征。因此，为了改进我们的前期工作，在本文中，我们提出在漏洞预测中使用动态属性来补充静态属性。此外，由于现有的工作依赖于监督学习，它依赖于标记有已知漏洞的训练数据的可用性。本文提出了基于分类和聚类的预测模型，以预测漏洞，分别在有或没有标记训练数据的情况下工作。在我们对六个应用程序的实验中，我们基于混合(静态和动态)属性的新监督漏洞预测器平均实现了90%的召回率和85%的准确率，与基于静态分析的预测相比，召回率大幅提高。虽然没有那么准确，但我们基于聚类的无监督预测器平均达到了76%的召回率和39%的准确率，因此表明它们在没有标记训练数据的情况下是有用的。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Mining SQL injection and cross site scripting vulnerabilities using hybrid program analysis

In previous work, we proposed a set of static attributes that characterize input validation and input sanitization code patterns. We showed that some of the proposed static attributes are significant predictors of SQL injection and cross site scripting vulnerabilities. Static attributes have the advantage of reflecting general properties of a program. Yet, dynamic attributes collected from execution traces may reflect more specific code characteristics that are complementary to static attributes. Hence, to improve our initial work, in this paper, we propose the use of dynamic attributes to complement static attributes in vulnerability prediction. Furthermore, since existing work relies on supervised learning, it is dependent on the availability of training data labeled with known vulnerabilities. This paper presents prediction models that are based on both classification and clustering in order to predict vulnerabilities, working in the presence or absence of labeled training data, respectively. In our experiments across six applications, our new supervised vulnerability predictors based on hybrid (static and dynamic) attributes achieved, on average, 90% recall and 85% precision, that is a sharp increase in recall when compared to static analysis-based predictions. Though not nearly as accurate, our unsupervised predictors based on clustering achieved, on average, 76% recall and 39% precision, thus suggesting they can be useful in the absence of labeled training data.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2013 35th International Conference on Software Engineering (ICSE)

自引率

0.00%

发文量