Precisely Extracting Complex Variable Values from Android Apps

ACM Transactions on Software Engineering and Methodology | Pub Date: 2024-02-27 | DOI: 10.1145/3649591 | IF 6.6 | Zone 2, Computer Science | JCR Q1, Computer Science, Software Engineering
Marc Miltenberger, Steven Arzt
Precisely Extracting Complex Variable Values from Android Apps

Millions of users nowadays rely on their smartphones to process sensitive data through apps from various vendors and sources. Therefore, it is vital to assess these apps for security vulnerabilities and privacy violations. Information such as which server an app connects to, through which protocol, and which algorithm it applies for encryption is usually encoded as variable values and arguments of API calls. However, extracting these values from an app is not trivial. The source code of an app is usually not available, and manual reverse engineering is cumbersome with binary sizes in the tens of megabytes. Current automated tools, on the other hand, cannot retrieve values that are computed at runtime through complex transformations.
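To make the problem concrete, the following is a minimal sketch of the kind of code in which a security-relevant value never appears as a plain constant. It is not taken from the paper: the class `EndpointExample`, the methods `decodeHost` and `buildEndpoint`, the obfuscation scheme (reverse, then shift each character by one), and the placeholder host `api.example.com` are all assumptions made for illustration.

```java
// Hypothetical illustration (not from the paper): the server address only
// exists after several transformations have run.
class EndpointExample {
    // Obfuscated host: the real name with every character shifted by +1, then reversed.
    // Searching the binary for string constants would not reveal "api.example.com".
    static final String OBFUSCATED_HOST = "npd/fmqnbyf/jqb";

    // Undo the obfuscation: reverse the string, then shift every character by -1.
    public static String decodeHost(String obfuscated) {
        StringBuilder sb = new StringBuilder(obfuscated).reverse();
        for (int i = 0; i < sb.length(); i++) {
            sb.setCharAt(i, (char) (sb.charAt(i) - 1));
        }
        return sb.toString();
    }

    // The protocol depends on a branch, so the returned URL has more than one
    // possible value at the return statement.
    public static String buildEndpoint(boolean secure) {
        String scheme = secure ? "https" : "http";
        return scheme + "://" + decodeHost(OBFUSCATED_HOST) + "/login";
    }

    public static void main(String[] args) {
        // In a real app this string would be passed to a networking API call.
        System.out.println(buildEndpoint(true));
        System.out.println(buildEndpoint(false));
    }
}
```

In this sketch, an analyst who wants to know which server and protocol the app uses has to recover both possible results of `buildEndpoint`, which requires evaluating the decoding loop and the string concatenation rather than just reading constants.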

In this paper, we present ValDroid, a novel static analysis tool for automatically extracting the set of possible values for a given variable at a given statement in the Dalvik byte code of an Android app. We evaluate ValDroid against existing approaches (JSA, Violist, DroidRA, Harvester, BlueSeal, StringHound, IC3, COAL) on benchmarks and 794 real-world apps. ValDroid greatly outperforms existing tools. It provides an average F1 score of more than 90%, while only requiring 0.1 seconds per value on average. For many data types including Network Connections and Dynamic Code Loading, its recall is more than twice the recall of the best existing approaches.
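For readers unfamiliar with the metric, the F1 score cited above is the harmonic mean of precision and recall. The standard definition is given below; the numbers in the worked example are purely illustrative and are not results from the paper.

```latex
% Standard definitions, with TP = true positives, FP = false positives, FN = false negatives
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 P R}{P + R}

% Illustrative values only (not from the paper): P = 0.95, R = 0.90
F_1 = \frac{2 \cdot 0.95 \cdot 0.90}{0.95 + 0.90} = \frac{1.71}{1.85} \approx 0.924
```

Because the harmonic mean is pulled toward the smaller of the two values, an F1 score above 90% can only be reached when both precision and recall are high.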

Source journal
ACM Transactions on Software Engineering and Methodology (Engineering and Technology - Computer Science: Software Engineering)
CiteScore: 6.30
Self-citation rate: 4.50%
Articles published: 164
Review time: >12 weeks
Journal description: Designing and building a large, complex software system is a tremendous challenge. ACM Transactions on Software Engineering and Methodology (TOSEM) publishes papers on all aspects of that challenge: specification, design, development and maintenance. It covers tools and methodologies, languages, data structures, and algorithms. TOSEM also reports on successful efforts, noting practical lessons that can be scaled and transferred to other projects, and often looks at applications of innovative technologies. The tone is scholarly but readable; the content is worthy of study; the presentation is effective.