Hanqing Jiang, Shaopei Ji, Chengchao Zha, Yanhong Liu
DOI: 10.1117/12.3032032 (https://doi.org/10.1117/12.3032032)
Published 2024-06-06 in Other Conferences, volume 18 2-3, pages 131751N - 131751N-10. Publication type: Journal Article. Source: Semanticscholar.
Software vulnerability detection method based on code attribute graph presentation and Bi-LSTM neural network extraction
Software is growing larger and more complex, and vulnerabilities are taking increasingly diverse forms. Traditional vulnerability detection methods depend heavily on manual effort and are weak at detecting unknown vulnerabilities, so they can no longer meet the requirements of detecting diverse vulnerabilities. To improve the detection of unknown vulnerabilities, many machine learning methods have been applied to software vulnerability detection. However, existing methods lose much of the syntactic and semantic information during code representation, which leads to high false positive and false negative rates. To solve this problem, this paper presents a software vulnerability detection method based on the code attribute graph and Bi-LSTM (bidirectional Long Short-Term Memory). Abstract syntax tree sequences and control flow graph sequences are extracted from the code attribute graph of each function to reduce the loss of information during code representation, and Bi-LSTM is used to build the feature extraction model. Experimental results show that, compared with a method based on the abstract syntax tree, this method greatly improves the accuracy and recall of vulnerability detection, improves detection on real data sets that mix source code from multiple software projects, and effectively reduces the false positive rate and false negative rate.
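The extraction step described in the abstract can be pictured with a small sketch. The abstract does not give the graph schema, the traversal order, or the token encoding, so everything below is an assumption made for illustration: the toy graph, the node labels, the `AST`/`CFG` edge tags, the depth-first traversals, and the helper names `ast_sequence`, `cfg_sequence`, and `encode` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Toy code attribute graph for a hypothetical function:
#   int f(int x) { if (x > 0) { x = x - 1; } return x; }
# Node ids, labels, and edge layout are illustrative assumptions.
NODES = {
    1: "FunctionDef",
    2: "IfStatement",
    3: "Condition",
    4: "Assignment",
    5: "ReturnStatement",
}

# Each edge is (source, target, kind); kind is "AST" or "CFG".
EDGES = [
    (1, 2, "AST"),
    (1, 5, "AST"),
    (2, 3, "AST"),
    (2, 4, "AST"),
    (3, 4, "CFG"),
    (3, 5, "CFG"),
    (4, 5, "CFG"),
]


def successors(edges, kind):
    """Group edge targets by source node for one edge kind, keeping edge order."""
    out = defaultdict(list)
    for src, dst, k in edges:
        if k == kind:
            out[src].append(dst)
    return out


def ast_sequence(nodes, edges, root):
    """Pre-order depth-first walk over AST edges, returning node labels."""
    kids = successors(edges, "AST")
    seq, stack = [], [root]
    while stack:
        node = stack.pop()
        seq.append(nodes[node])
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(kids[node]))
    return seq


def cfg_sequence(nodes, edges, entry):
    """Depth-first walk over CFG edges from the entry node, visiting each node once."""
    succ = successors(edges, "CFG")
    seq, stack, seen = [], [entry], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        seq.append(nodes[node])
        stack.extend(reversed(succ[node]))
    return seq


def encode(seq, vocab):
    """Map labels to integer ids, growing the vocabulary; id 0 is left free for padding."""
    return [vocab.setdefault(tok, len(vocab) + 1) for tok in seq]


ast_seq = ast_sequence(NODES, EDGES, root=1)
cfg_seq = cfg_sequence(NODES, EDGES, entry=3)

vocab = {}
ast_ids = encode(ast_seq, vocab)
cfg_ids = encode(cfg_seq, vocab)

print(ast_seq, ast_ids)
print(cfg_seq, cfg_ids)
```

In a pipeline of the kind the abstract outlines, the two integer sequences would then be padded to a fixed length and passed to a Bi-LSTM feature extractor. How the paper combines the two sequences is not stated in the abstract, so that step is left out here.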