Precisely Extracting Complex Variable Values from Android Apps
Authors: Marc Miltenberger, Steven Arzt
Journal: ACM Transactions on Software Engineering and Methodology
Publication date: 2024-02-27
DOI: https://doi.org/10.1145/3649591
Abstract
Millions of users nowadays rely on their smartphones to process sensitive data through apps from various vendors and sources. Therefore, it is vital to assess these apps for security vulnerabilities and privacy violations. Information such as which server an app connects to, through which protocol, and which algorithm it applies for encryption is usually encoded as variable values and arguments of API calls. However, extracting these values from an app is not trivial. The source code of an app is usually not available, and manual reverse engineering is cumbersome with binary sizes in the tens of megabytes. Current automated tools, on the other hand, cannot retrieve values that are computed at runtime through complex transformations.
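To make the problem concrete, the following is a minimal, hypothetical sketch of the kind of code the paragraph above describes: a server URL that never appears as a plain constant but is assembled at runtime through several transformations. The class name, the `decode` helper, the host `api.example.com`, and the path are all assumptions made up for illustration; they are not taken from the paper or from any real app.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ObfuscatedEndpoint {

    // Hypothetical obfuscation helper: Base64-decodes the input and
    // then reverses the resulting string.
    static String decode(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        String reversed = new String(raw, StandardCharsets.UTF_8);
        return new StringBuilder(reversed).reverse().toString();
    }

    // The protocol depends on a branch and the host is hidden behind
    // decode(), so a plain search for string constants finds neither
    // the full URL nor the host name.
    static String buildUrl(boolean secure) {
        String scheme = secure ? "https" : "http";
        String host = decode("bW9jLmVscG1heGUuaXBh");
        return scheme + "://" + host + "/v1/login";
    }

    public static void main(String[] args) {
        // An analysis asked for the possible values of the URL at this
        // statement would ideally report both variants.
        System.out.println(buildUrl(true));
        System.out.println(buildUrl(false));
    }
}
```

The set of possible values at the call site has two elements, one per protocol, which is why the question is phrased in terms of value sets rather than a single value.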
In this paper, we present ValDroid, a novel static analysis tool for automatically extracting the set of possible values for a given variable at a given statement in the Dalvik bytecode of an Android app. We evaluate ValDroid against existing approaches (JSA, Violist, DroidRA, Harvester, BlueSeal, StringHound, IC3, COAL) on benchmarks and 794 real-world apps. ValDroid greatly outperforms existing tools. It provides an average F1 score of more than 90%, while only requiring 0.1 seconds per value on average. For many data types, including Network Connections and Dynamic Code Loading, its recall is more than twice the recall of the best existing approaches.
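For readers unfamiliar with the metrics, the sketch below shows how precision, recall, and the F1 score can be computed when a tool reports a set of values and a ground-truth set is known. It uses the standard textbook definitions; the abstract does not describe the paper's exact scoring procedure, so the class name, method name, and sample values here are assumptions for illustration only.

```java
import java.util.HashSet;
import java.util.Set;

public class ValueSetScore {

    // Standard F1: harmonic mean of precision and recall, computed
    // over the extracted value set and the expected value set.
    static double f1(Set<String> extracted, Set<String> expected) {
        Set<String> hits = new HashSet<>(extracted);
        hits.retainAll(expected); // values that are both reported and correct
        if (hits.isEmpty()) {
            return 0.0; // also avoids division by zero for empty sets
        }
        double precision = (double) hits.size() / extracted.size();
        double recall = (double) hits.size() / expected.size();
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Made-up example: the tool finds two of three real values
        // and reports no wrong ones.
        Set<String> extracted = Set.of("AES/CBC/PKCS5Padding", "DES");
        Set<String> expected =
                Set.of("AES/CBC/PKCS5Padding", "AES/GCM/NoPadding", "DES");
        System.out.println(f1(extracted, expected));
    }
}
```

In this made-up case precision is 1 and recall is 2/3, so the harmonic mean is 0.8; a tool that misses values is penalized even if everything it reports is correct.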
About the journal:
Designing and building a large, complex software system is a tremendous challenge. ACM Transactions on Software Engineering and Methodology (TOSEM) publishes papers on all aspects of that challenge: specification, design, development and maintenance. It covers tools and methodologies, languages, data structures, and algorithms. TOSEM also reports on successful efforts, noting practical lessons that can be scaled and transferred to other projects, and often looks at applications of innovative technologies. The tone is scholarly but readable; the content is worthy of study; the presentation is effective.