Sensitive Behavioral Chain-Focused Android Malware Detection Fused With AST Semantics

IF 6.3 1区 计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS IEEE Transactions on Information Forensics and Security Pub Date : 2024-09-26 DOI:10.1109/TIFS.2024.3468891
Jiacheng Gong;Weina Niu;Song Li;Mingxue Zhang;Xiaosong Zhang
{"title":"Sensitive Behavioral Chain-Focused Android Malware Detection Fused With AST Semantics","authors":"Jiacheng Gong;Weina Niu;Song Li;Mingxue Zhang;Xiaosong Zhang","doi":"10.1109/TIFS.2024.3468891","DOIUrl":null,"url":null,"abstract":"The proliferation of Android malware poses a substantial security threat to mobile devices. Thus, achieving efficient and accurate malware detection and malware family identification is crucial for safeguarding users’ individual property and privacy. Graph-based approaches have demonstrated remarkable detection performance in the realm of intelligent Android malware detection methods. This is attributed to the robust representation capabilities of graphs and the rich semantic information. The function call graph (FCG) is the most widely used graph in intelligent Android malware detection. However, existing FCG-based malware detection methods face challenges, such as the enormous computational and storage costs of modeling large graphs. Additionally, the ignorance of code semantics also makes them susceptible to structured attacks. In this paper, we proposed AndroAnalyzer, which embeds abstract syntax tree (AST) code semantics while focusing on sensitive behavior chains. It leverages FCGs to represent the macroscopic behavior of the application, and employs structured code semantics to represent the microscopic behavior of functions. Furthermore, we proposed the sensitive function call graph (SFCG) generation algorithm to narrow down the analysis scope to sensitive function calls, and the AST vectorization algorithm (AST2Vec) to capture structured code semantics. Experimental results demonstrate that the proposed SFCG generation algorithm noticeably reduces graph size while ensuring robust detection performance. AndroAnalyzer outperforms the baseline methods in binary and multiclass classification tasks, achieving F1-scores of 99.21% and 98.45% respectively. Moreover, AndroAnalyzer (trained with samples of 2010-2018) exhibits good generalization capabilities in detecting samples of 2019-2022.","PeriodicalId":13492,"journal":{"name":"IEEE Transactions on Information Forensics and Security","volume":"19 ","pages":"9216-9229"},"PeriodicalIF":6.3000,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Information Forensics and Security","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10695137/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

The proliferation of Android malware poses a substantial security threat to mobile devices. Thus, achieving efficient and accurate malware detection and malware family identification is crucial for safeguarding users’ individual property and privacy. Graph-based approaches have demonstrated remarkable detection performance in the realm of intelligent Android malware detection methods. This is attributed to the robust representation capabilities of graphs and the rich semantic information. The function call graph (FCG) is the most widely used graph in intelligent Android malware detection. However, existing FCG-based malware detection methods face challenges, such as the enormous computational and storage costs of modeling large graphs. Additionally, the ignorance of code semantics also makes them susceptible to structured attacks. In this paper, we proposed AndroAnalyzer, which embeds abstract syntax tree (AST) code semantics while focusing on sensitive behavior chains. It leverages FCGs to represent the macroscopic behavior of the application, and employs structured code semantics to represent the microscopic behavior of functions. Furthermore, we proposed the sensitive function call graph (SFCG) generation algorithm to narrow down the analysis scope to sensitive function calls, and the AST vectorization algorithm (AST2Vec) to capture structured code semantics. Experimental results demonstrate that the proposed SFCG generation algorithm noticeably reduces graph size while ensuring robust detection performance. AndroAnalyzer outperforms the baseline methods in binary and multiclass classification tasks, achieving F1-scores of 99.21% and 98.45% respectively. Moreover, AndroAnalyzer (trained with samples of 2010-2018) exhibits good generalization capabilities in detecting samples of 2019-2022.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
融合 AST 语义的以敏感行为链为重点的安卓恶意软件检测
安卓恶意软件的激增对移动设备构成了巨大的安全威胁。因此,实现高效、准确的恶意软件检测和恶意软件家族识别对于保护用户的个人财产和隐私至关重要。在智能安卓恶意软件检测方法领域,基于图的方法已显示出卓越的检测性能。这归功于图的强大表示能力和丰富的语义信息。函数调用图(FCG)是智能安卓恶意软件检测中使用最广泛的图。然而,现有的基于 FCG 的恶意软件检测方法面临着一些挑战,例如对大型图建模的巨大计算和存储成本。此外,对代码语义的忽略也使其容易受到结构化攻击。在本文中,我们提出了 AndroAnalyzer,它嵌入了抽象语法树(AST)代码语义,同时关注敏感行为链。它利用 FCG 表示应用程序的宏观行为,并采用结构化代码语义表示函数的微观行为。此外,我们还提出了敏感函数调用图(SFCG)生成算法,以将分析范围缩小到敏感函数调用,并提出了 AST 向量化算法(AST2Vec)来捕捉结构化代码语义。实验结果表明,所提出的 SFCG 生成算法在确保稳健检测性能的同时,明显缩小了图的大小。AndroAnalyzer 在二分类和多分类任务中的表现优于基准方法,F1 分数分别达到 99.21% 和 98.45%。此外,AndroAnalyzer(使用 2010-2018 年的样本进行训练)在检测 2019-2022 年的样本时表现出良好的泛化能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
IEEE Transactions on Information Forensics and Security
IEEE Transactions on Information Forensics and Security 工程技术-工程:电子与电气
CiteScore
14.40
自引率
7.40%
发文量
234
审稿时长
6.5 months
期刊介绍: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance and systems applications that incorporate these features
期刊最新文献
Attackers Are Not the Same! Unveiling the Impact of Feature Distribution on Label Inference Attacks Backdoor Online Tracing With Evolving Graphs LHADRO: A Robust Control Framework for Autonomous Vehicles Under Cyber-Physical Attacks Towards Mobile Palmprint Recognition via Multi-view Hierarchical Graph Learning Succinct Hash-based Arbitrary-Range Proofs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1