Learning to Identify Security-Related Issues Using Convolutional Neural Networks

David N. Palacio, Daniel McCrystal, Kevin Moran, Carlos Bernal-Cárdenas, D. Poshyvanyk, Chris Shenefiel
Published in: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2019-08-01
DOI: 10.1109/ICSME.2019.00024 (https://doi.org/10.1109/ICSME.2019.00024)
Citations: 12

Abstract

Software security is becoming a high priority for large companies and start-ups alike because of the growing potential for harm that vulnerabilities and breaches carry. However, attaining robust security assurance while delivering features is a precarious balancing act in agile development. One way to help development teams secure their software products is to design and develop security-focused automation. We therefore present a novel approach, called SecureReqNet, for automatically identifying whether issues in software issue tracking systems describe security-related content. Our approach consists of a two-phase neural network architecture that operates purely on the natural language descriptions of issues. The first phase learns high-dimensional word embeddings from hundreds of thousands of vulnerability descriptions listed in the CVE database and issue descriptions extracted from open source projects. The second phase uses the semantic ontology represented by these embeddings to train a convolutional neural network that predicts whether a given issue is security-related. We evaluated SecureReqNet by applying it to a dataset of thousands of issues mined from popular projects on GitLab and GitHub. We also applied it to identify security-related requirements in a commercial software project developed by a major telecommunication company. Our preliminary results are encouraging: SecureReqNet achieves an accuracy of 96% on open source issues and 71.6% on industrial requirements.
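The second phase described in the abstract (embeddings fed into a convolutional classifier) can be illustrated with a minimal forward-pass sketch. This is not the SecureReqNet architecture: the embedding table, the single two-token filter, the output weights, and the function names `embed`, `conv1d_relu`, and `predict_security_score` are all hypothetical toy values chosen for illustration, whereas the paper learns its embeddings and weights from CVE descriptions and issue text.

```python
import math

# ASSUMPTION: hypothetical 3-dimensional word embeddings with made-up values.
# In the paper, embeddings are learned from CVE and issue descriptions.
EMBEDDINGS = {
    "buffer":   [0.9, 0.1, 0.0],
    "overflow": [0.8, 0.2, 0.1],
    "button":   [0.0, 0.9, 0.3],
    "color":    [0.1, 0.8, 0.4],
}
UNKNOWN = [0.0, 0.0, 0.0]  # ASSUMPTION: out-of-vocabulary words map to zeros

# ASSUMPTION: one hand-picked convolution filter spanning 2 consecutive
# tokens (shape: window x embedding_dim), plus toy output-layer parameters.
FILTER = [[1.0, -1.0, 0.0],
          [1.0, -1.0, 0.0]]
FILTER_BIAS = 0.0
OUT_WEIGHT = 4.0
OUT_BIAS = -2.0


def embed(text):
    """Map a whitespace-tokenized issue description to a list of vectors."""
    return [EMBEDDINGS.get(tok, UNKNOWN) for tok in text.lower().split()]


def conv1d_relu(vectors, kernel, bias):
    """Slide the kernel over consecutive token windows and apply ReLU."""
    window = len(kernel)
    outputs = []
    for start in range(len(vectors) - window + 1):
        total = bias
        for offset in range(window):
            vec = vectors[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], vec))
        outputs.append(max(0.0, total))
    return outputs


def predict_security_score(text):
    """Return a score in (0, 1); higher means more likely security-related."""
    feature_map = conv1d_relu(embed(text), FILTER, FILTER_BIAS)
    # Max-over-time pooling keeps the strongest filter response.
    pooled = max(feature_map) if feature_map else 0.0
    return 1.0 / (1.0 + math.exp(-(OUT_WEIGHT * pooled + OUT_BIAS)))


if __name__ == "__main__":
    for issue in ("Buffer overflow in parser", "Change button color"):
        print(issue, round(predict_security_score(issue), 3))
```

With these toy values the first issue scores well above the second, which only demonstrates the mechanics of convolution and pooling over embedded tokens; a real model would use many filters of several widths and weights trained on labeled issues.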