HardVD: High-capacity cross-modal adversarial reprogramming for data-efficient vulnerability detection

Information Sciences (IF 8.1; Zone 1, Computer Science; JCR: N/A; category: COMPUTER SCIENCE, INFORMATION SYSTEMS) Pub Date: 2024-08-22 DOI: 10.1016/j.ins.2024.121370
Article URL: https://www.sciencedirect.com/science/article/pii/S0020025524012842
Citations: 0

Abstract

The substantial proliferation of software vulnerabilities poses a persistent threat to system security, driving increased interest in applying deep learning (DL) for vulnerability detection. However, DL-based detectors often operate with a fixed number of input tokens, leading to semantic loss over large code snippets. Additionally, developing these detectors demands substantial labeled data and training time. To address these limitations, this paper proposes HardVD, which explores High-capacity cross-modal adversarial reprogramming for data-efficient Vulnerability Detection. HardVD devises a high-capacity semantic extractor to capture salient features per line of code, which are then arranged as patches to form an image representing the target function. These images are processed using convolutional filters as universal perturbations and non-parametric label remapping to adapt a pretrained Vision Transformer (ViT) for vulnerability detection, updating only the limited parameters of the perturbation filters during training. Extensive experiments demonstrate that HardVD outperforms DL-based baselines in terms of detection effectiveness, data-limited performance, and computational overhead. The ablation study also confirms the essential role of our high-capacity semantic extractor, without which an averaged relative decrease of 5.87% and 7.98% in accuracy and F1 score is observed, respectively.
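
As a rough illustration of the pipeline described above, the following pure-Python sketch tiles per-line feature vectors into an image, applies a small filter to that image as an input perturbation, and maps source-class scores to target classes through fixed groups. The function names, the zero-padding policy, and the mean-based grouping are all assumptions made for illustration; the abstract does not specify these details, and the pretrained ViT that would sit between the filter and the remapping is omitted.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def lines_to_image(line_feats: Sequence[Sequence[float]], patch: int, grid: int) -> Matrix:
    """Tile per-line feature vectors into a (patch*grid) x (patch*grid) image.

    Each vector must have patch*patch entries and becomes one square patch.
    Assumed policy: line i fills the patch at row i // grid, column i % grid;
    missing lines stay zero and lines beyond grid*grid are dropped.
    """
    size = patch * grid
    image = [[0.0] * size for _ in range(size)]
    for i, feat in enumerate(line_feats[: grid * grid]):
        if len(feat) != patch * patch:
            raise ValueError("each line feature needs patch*patch values")
        top, left = (i // grid) * patch, (i % grid) * patch
        for k, value in enumerate(feat):
            image[top + k // patch][left + k % patch] = value
    return image


def apply_filter(image: Matrix, kernel: Matrix) -> Matrix:
    """Zero-padded 2D cross-correlation that keeps the image size.

    In a reprogramming setup the kernel weights would be the only trainable
    parameters; here they are simply passed in (hypothetical stand-in).
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - kh // 2, c + j - kw // 2
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


def remap_labels(source_scores: Sequence[float], groups: Sequence[Sequence[int]]) -> List[float]:
    """Parameter-free label remapping (assumed form).

    Each target class gets the mean score of a fixed group of source classes.
    """
    return [sum(source_scores[j] for j in group) / len(group) for group in groups]
```

In this sketch, a function with three lines of code and `patch=2, grid=2` would yield a 4 x 4 image whose last patch is zero-filled, and `remap_labels` would reduce the frozen model's source-class scores to two scores (for example, vulnerable and non-vulnerable) without introducing any trainable weights.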

Source journal: Information Sciences
Category: Engineering and Technology, Computer Science: Information Systems
CiteScore: 14.00
Self-citation rate: 17.30%
Articles published: 1322
Review time: 10.4 months
Journal description: Informatics and Computer Science Intelligent Systems Applications is an international journal that publishes original and creative research findings in the field of information sciences, along with a limited number of timely tutorial and survey contributions. The journal aims to serve a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up to date with research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.