Errors, Misunderstandings, and Attacks: Analyzing the Crowdsourcing Process of Ad-blocking Systems

Proceedings of the Internet Measurement Conference 2018. Pub Date: 2019-10-21. DOI: 10.1145/3355369.3355588

Mshabab Alrizah, Sencun Zhu, Xinyu Xing, Gang Wang


Citations: 29

Abstract

Ad-blocking systems such as Adblock Plus rely on crowdsourcing to build and maintain filter lists, which are the basis for determining which ads to block on web pages. In this work, we seek to advance our understanding of the ad-blocking community as well as the errors and pitfalls of the crowdsourcing process. To do so, we collected and analyzed a longitudinal dataset that covers nine years of changes to the popular filter list EasyList, along with the error reports submitted by the crowd over the same period. Our study yielded a number of significant findings regarding the characteristics of false positive (FP) and false negative (FN) errors and their causes. For instance, we found that false positive errors (i.e., incorrectly blocking legitimate content) still took a long time to be discovered (50% of them took more than a month) despite the community effort. Both EasyList editors and website owners were to blame for the false positives. In addition, we found that a great number of false negative errors (i.e., failing to block real advertisements) were either incorrectly reported or simply ignored by the editors. Furthermore, we analyzed evasion attacks from ad publishers against ad blockers. In total, our analysis covers 15 types of attack methods, including 8 methods that have not been studied by the research community. We show how ad publishers have utilized them to circumvent ad blockers and empirically measure the reactions of ad blockers. Through in-depth analysis, our findings are expected to help shed light on future work to evolve ad blocking and optimize crowdsourcing mechanisms.



