Towards accurate detection of obfuscated web tracking
Hoan Le, Federico Fallace, P. Barlet-Ros
2017 IEEE International Workshop on Measurement and Networking (M&N), published 2017-09-01
DOI: 10.1109/IWMN.2017.8078365 (https://doi.org/10.1109/IWMN.2017.8078365)
Citations: 15
Abstract
Web tracking is currently recognized as one of the most important privacy threats on the Internet. In recent years, many methodologies have been developed to uncover web trackers. Most of them are based on static code analysis and the use of predefined blacklists. However, our main hypothesis is that web tracking has started to use obfuscated programming, a transformation of code that makes previous detection methodologies ineffective and easy to evade. In this paper, we propose a new methodology based on dynamic code analysis that monitors the actual JavaScript calls made by the browser and compares them to the original source code of the website in order to detect obfuscated tracking. The main advantage of this approach is that detection cannot be evaded by code obfuscation. We applied this methodology to detect the use of canvas-font tracking and canvas fingerprinting on the top-10K most visited websites according to Alexa's ranking. Canvas-based tracking is a JavaScript fingerprinting method that uses the HTML5 canvas element to uniquely identify a user. Our results show that 10.44% of the top-10K websites use canvas-based tracking (canvas-font and canvas fingerprinting), while obfuscation was used in 2.25% of them. These results confirm our initial hypothesis that obfuscated programming is already in use in web tracking. Finally, we argue that canvas-based tracking may be more prevalent on secondary pages than on the home pages of websites.
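The comparison step described in the abstract (runtime calls versus original source code) can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not detail. The function name `classify_script`, the set `CANVAS_APIS`, and the rule "a call seen at runtime whose name never appears literally in the source counts as obfuscated" are all assumptions made for illustration. The method names themselves are real HTML5 canvas API methods.

```python
# Hypothetical sketch: the names and the classification rule below are
# illustrative assumptions, not the method used in the paper.

# Real canvas API method names commonly associated with canvas-based tracking.
CANVAS_APIS = {"toDataURL", "getImageData", "fillText", "measureText"}


def classify_script(observed_calls, source_code):
    """Compare runtime-observed calls against the static source text.

    observed_calls: method names logged while the page executed.
    source_code: the script text as delivered by the website.
    Returns "none", "plain", or "obfuscated".
    """
    used = CANVAS_APIS & set(observed_calls)
    if not used:
        return "none"  # no canvas activity observed at runtime
    # A method that ran but whose name is absent from the source text was
    # presumably assembled dynamically, which this sketch treats as obfuscation.
    hidden = {name for name in used if name not in source_code}
    return "obfuscated" if hidden else "plain"


plain_src = "ctx.fillText('abc', 2, 2); var d = canvas.toDataURL();"
obfuscated_src = "var k = ['to','Data','URL'].join(''); var d = canvas[k]();"

print(classify_script(["fillText", "toDataURL"], plain_src))
print(classify_script(["toDataURL"], obfuscated_src))
print(classify_script([], plain_src))
```

A static scan of `obfuscated_src` for the string `toDataURL` finds nothing, while the runtime log still records the call. That gap is what the sketch flags, and it is why monitoring executed calls is harder to evade than matching source text.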