Who Changed You? Obfuscator Identification for Android

2017 IEEE/ACM 4th International Conference on Mobile Software Engineering and Systems (MOBILESoft). Pub Date: 2017-05-20. DOI: 10.1109/MOBILESoft.2017.18

Yan Wang, A. Rountev


Citations: 41

Abstract

Android developers commonly use app obfuscation to secure their apps and intellectual property. Although obfuscation provides protection, it presents an obstacle for a number of legitimate program analyses such as detection of app cloning and repackaging, malware detection, identification of third-party libraries, provenance analysis for digital forensics, and reverse engineering for test generation and performance analysis. If the obfuscator used to create an app can be identified, and if some details of the obfuscation process can be inferred, subsequent analyses can exploit this knowledge. Thus, it is desirable to be able to automatically analyze a given app and determine (1) whether it was obfuscated, (2) which obfuscator was used, and (3) how the obfuscator was configured. We have developed novel techniques to identify the obfuscator of an Android app for several widely-used obfuscation tools and for a number of their configuration options. We define the obfuscator identification problem and propose a solution based on machine learning. To the best of our knowledge, this is the first work to formulate and solve this problem. We identify a feature vector that represents the characteristics of the obfuscated code. We then implement a tool that extracts this feature vector from Dalvik bytecode and uses it to identify the obfuscator provenance information. We evaluate the proposed approach on real-world Android apps obfuscated with different obfuscators, under several configurations. Our experiments indicate that the approach identifies the obfuscator with about 97% accuracy and recognizes the configuration with more than 90% accuracy.
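The pipeline the abstract outlines (extract a feature vector from the obfuscated code, then let a learned model name the obfuscator) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the abstract does not list the features or the learning algorithm, so the three identifier-name statistics, the nearest-centroid classifier, and the labels `none`, `tool_a`, and `tool_b` are hypothetical stand-ins rather than the authors' design. The sketch also assumes the identifier strings have already been pulled out of the app's Dalvik bytecode by some other tool.

```python
from collections import defaultdict
from math import dist


def name_features(identifiers):
    """Map a list of identifier strings to a small, hypothetical feature vector."""
    n = len(identifiers) or 1  # avoid division by zero on an empty list
    short = sum(1 for s in identifiers if len(s) <= 2) / n        # share of 1-2 char names
    non_ascii = sum(1 for s in identifiers if not s.isascii()) / n  # share of non-ASCII names
    mean_len = sum(len(s) for s in identifiers) / n
    return (short, non_ascii, mean_len / 20.0)  # crude scaling so length is near [0, 1]


def train_centroids(samples):
    """samples: iterable of (label, identifiers). Returns label -> mean feature vector."""
    grouped = defaultdict(list)
    for label, identifiers in samples:
        grouped[label].append(name_features(identifiers))
    return {
        label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for label, vecs in grouped.items()
    }


def predict(centroids, identifiers):
    """Return the label whose centroid is closest (Euclidean) to the app's features."""
    vec = name_features(identifiers)
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Toy training data with made-up labels; real training would use many apps per class.
TRAINING = [
    ("none", ["MainActivity", "onCreate", "loadUserProfile"]),
    ("none", ["SettingsFragment", "saveState", "adapter"]),
    ("tool_a", ["a", "b", "c", "aa"]),
    ("tool_a", ["a", "ab", "b", "d"]),
    ("tool_b", ["\u00e9", "\u00e8\u00e9", "\u00ea", "\u00eb"]),
]

if __name__ == "__main__":
    centroids = train_centroids(TRAINING)
    print(predict(centroids, ["HomeScreen", "refreshList"]))
    print(predict(centroids, ["a", "c", "e"]))
```

The same structure extends to the three questions the abstract poses: a first classifier could separate obfuscated from unobfuscated apps, a second could pick the tool, and a third could be trained per tool on configuration labels. A production system would need far richer features and a stronger model than this toy.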

