Can Android Applications Be Identified Using Only TCP/IP Headers of Their Launch Time Traffic?

Proceedings of the 9th ACM Conference on Security & Privacy in Wireless and Mobile Networks. Pub Date: 2016-07-18. DOI: 10.1145/2939918.2939929

Hasan Faik Alan, J. Kaur


Citations: 75

Abstract

The ability to identify mobile apps in network traffic has significant implications in many domains, including traffic management, malware detection, and maintaining user privacy. App identification methods in the literature typically use deep packet inspection (DPI) and analyze HTTP headers to extract app fingerprints. However, these methods cannot be used if HTTP traffic is encrypted. We investigate whether Android apps can be identified from their launch-time network traffic using only TCP/IP headers. We first capture the network traffic of 86,109 app launches by repeatedly running 1,595 apps on 4 distinct Android devices. We then apply supervised learning methods previously used in the web page identification literature to identify the apps that generated the traffic. We find that: (i) popular Android apps can be identified with 88% accuracy from the sizes of the first 64 packets they generate, when the learning methods are trained and tested on data collected from the same device; (ii) when data from an unseen device (with a similar operating system/vendor) are used for testing, the apps can be identified with 67% accuracy; (iii) the identification accuracy does not drop significantly even if the training data are stale by several days; and (iv) the accuracy drops significantly if the operating system/vendor is very different. We discuss the implications of our findings as well as open issues.
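To make the idea of identifying an app from the sizes of its first 64 launch-time packets concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not name the classifiers used, so a 1-nearest-neighbor rule with an L1 distance is substituted here purely for illustration. The sign convention for direction, the function names, and the toy traces are all assumptions.

```python
from typing import List, Sequence, Tuple

N_PACKETS = 64  # the abstract reports results for the first 64 packets


def launch_features(packet_sizes: Sequence[int], n: int = N_PACKETS) -> List[int]:
    """Truncate or zero-pad a launch trace to exactly n packet sizes.

    Assumption: sizes are signed by direction (positive = sent by the
    device, negative = received). Both size and direction are available
    from TCP/IP headers alone, so no payload inspection is needed.
    """
    head = list(packet_sizes[:n])
    return head + [0] * (n - len(head))


def l1_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute per-position differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def identify_app(trace: Sequence[int],
                 training: Sequence[Tuple[str, Sequence[int]]]) -> str:
    """Return the label of the training trace closest to the query (1-NN)."""
    query = launch_features(trace)
    best_label, _ = min(
        ((label, l1_distance(query, launch_features(t))) for label, t in training),
        key=lambda pair: pair[1],
    )
    return best_label


# Made-up traces for two hypothetical apps; this is not data from the paper.
training = [
    ("app.weather", [60, -60, 52, 250, -1400, -1400, 52]),
    ("app.weather", [60, -60, 52, 248, -1400, -1380, 52]),
    ("app.chat", [60, -60, 52, 517, -180, 52, 93]),
]
unknown = [60, -60, 52, 251, -1400, -1390, 52]
predicted = identify_app(unknown, training)
print(predicted)
```

In this toy setup the training and query traces play the role of launches recorded on the same device; testing with traces from a different device would correspond to the cross-device setting for which the abstract reports lower accuracy.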


