Automated Third-Party Library Detection for Android Applications: Are We There Yet?

2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). Pub Date: 2020-09-01. DOI: 10.1145/3324884.3416582

Xian Zhan, Lingling Fan, Tianming Liu, Sen Chen, Li Li, Haoyu Wang, Yifei Xu, Xiapu Luo, Yang Liu


Citations: 38

Abstract

Third-party libraries (TPLs) have become a significant part of the Android ecosystem. Developers can employ various TPLs with different functionalities to facilitate their app development. Unfortunately, the popularity of TPLs also brings new challenges and even threats. TPLs may carry malicious or vulnerable code, which can infect popular apps and pose threats to mobile users. In addition, the code of third-party libraries can introduce noise into some downstream tasks (e.g., malware and repackaged app detection). Researchers have therefore developed various tools to identify TPLs. However, no existing work has studied these TPL detection tools in detail; different tools target different applications and differ in performance, yet little is known about how they compare. To better understand existing TPL detection tools and dissect TPL detection techniques, we conduct a comprehensive empirical study that fills this gap by evaluating and comparing all publicly available TPL detection tools on four criteria: effectiveness, efficiency, resilience to code obfuscation, and ease of use. We reveal their advantages and disadvantages based on a systematic and thorough empirical study. Furthermore, we conduct a user study to evaluate the usability of each tool. The results show that LibScout outperforms the others in effectiveness, LibRadar takes less time than the others and is also regarded as the easiest to use, and LibPecker performs best in defending against code obfuscation techniques. We further summarize the lessons learned from different perspectives, including users, tool implementation, and researchers. In addition, we enhance these open-source tools by fixing their limitations to improve their detection ability. We also build an extensible framework that integrates all existing available TPL detection tools and provides an online service for the research community. We make the evaluation dataset and enhanced tools publicly available. We believe our work provides a clear picture of existing TPL detection techniques and also gives a roadmap for future directions.
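As a rough illustration of what an effectiveness comparison can look like, the sketch below scores one tool's output for a single app against a ground-truth set of libraries using set-based precision, recall, and F1. The abstract does not state which metrics the study uses, so the choice of metrics, the function name `score_detection`, and the library identifiers are all assumptions made for this example.

```python
from typing import Dict, Set


def score_detection(detected: Set[str], ground_truth: Set[str]) -> Dict[str, float]:
    """Score one tool's reported libraries against a ground-truth set.

    Hypothetical helper: libraries are compared by exact identifier match,
    which ignores version information and obfuscated package names.
    """
    true_positives = len(detected & ground_truth)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up data for one app: the libraries it really contains,
# and the libraries a detection tool reported for it.
truth = {"com.squareup.okhttp3", "com.google.gson", "com.facebook.fresco"}
reported = {"com.squareup.okhttp3", "com.google.gson", "org.apache.commons.io"}

scores = score_detection(reported, truth)
print(scores)
```

Exact-match scoring like this is the simplest option; a comparison that also covers obfuscation resilience would need to score the same app before and after obfuscation, since package names may no longer match.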


