Minimizing GUI event traces
Lazaro Clapp, O. Bastani, Saswat Anand, A. Aiken. https://doi.org/10.1145/2950290.2950342

GUI input generation tools for Android apps, such as Android's Monkey, are useful for automatically producing test inputs, but these tests are generally orders of magnitude larger than necessary, making them difficult for humans to understand. We present a technique for minimizing the output of such tools. Our technique accounts for the non-deterministic behavior of mobile apps, producing small event traces that reach a desired activity with high probability. We propose a variant of delta debugging, augmented to handle non-determinism, to solve the problem of trace minimization. We evaluate our algorithm on two sets of commercial and open-source Android applications, showing that we can minimize large event traces reaching a particular application activity, producing traces that are, on average, less than 2% the size of the original traces.
A large-scale empirical comparison of static and dynamic test case prioritization techniques
Qi Luo, Kevin Moran, D. Poshyvanyk. https://doi.org/10.1145/2950290.2950344

The large body of existing research on Test Case Prioritization (TCP) techniques can be broadly classified into two categories: dynamic techniques (which rely on run-time execution information) and static techniques (which operate directly on source and test code). Absent from this body of work is a comprehensive study aimed at understanding and evaluating the static approaches and comparing them to dynamic approaches on a large set of projects. In this work, we perform the first extensive study aimed at empirically evaluating four static TCP techniques, comparing them with state-of-research dynamic TCP techniques at different test-case granularities (e.g., method- and class-level) in terms of effectiveness, efficiency, and similarity of faults detected. This study was performed on 30 real-world Java programs encompassing 431 KLoC. In terms of effectiveness, we find that the static call-graph-based technique outperforms the other static techniques at the test-class level, but the topic-model-based technique performs better at the test-method level. In terms of efficiency, the static call-graph-based technique is also the most efficient of the static techniques. When examining the similarity of faults detected by the four static techniques compared to the four dynamic ones, we find that, on average, the faults uncovered by these two groups of techniques are quite dissimilar, with the top 10% of test cases agreeing on only 25%-30% of detected faults. This prompts further research into the severity/importance of faults uncovered by these techniques, and into the potential for combining static and dynamic information for more effective approaches.
Automatic trigger generation for end user written rules for home automation
Chandrakana Nandi. https://doi.org/10.1145/2950290.2983965

To customize the behavior of a smart home, an end user writes rules. When an external event satisfies a rule's trigger, the rule's action executes; for example, when the temperature rises above a certain threshold, window awnings might be extended. End users often write incorrect rules. This paper's technique prevents a certain category of errors in the rules: errors due to too few triggers. It statically analyzes a rule's actions to automatically determine a set of necessary and sufficient triggers. We implemented the technique in a tool called TrigGen and tested it on 96 end-user-written rules for openHAB, an open-source home automation platform. It identified that 80% of the rules had fewer triggers than required for correct behavior. The missing triggers could lead to unexpected behavior and security vulnerabilities in a smart home.
Building a socio-technical theory of coordination: why and how (outstanding research award)
J. Herbsleb. https://doi.org/10.1145/2950290.2994160

Research aimed at understanding and addressing coordination breakdowns experienced in global software development (GSD) projects at Lucent Technologies took a path from open-ended qualitative exploratory studies to quantitative studies with a tight focus on a key problem, delay, and its causes. Multi-site work was not directly associated with delay; rather, multi-site work items involved more people than comparable same-site work items, and the number of people was a powerful predictor of delay. To counteract this, we developed and deployed tools and practices to support more effective communication and expertise location. After conducting two case studies of open source development, an extreme form of GSD, we realized that many tools and practices could be effective for multi-site work, but none seemed to work under all conditions. To achieve deeper insight, we developed and tested our Socio-Technical Theory of Coordination (STTC), in which the dependencies among engineering decisions are seen as defining a constraint satisfaction problem that the organization can solve in a variety of ways. I conclude by explaining how we applied these ideas to transparent development environments, then sketch important open research questions.
Effectiveness of code contribution: from patch-based to pull-request-based tools
Jiaxin Zhu, Minghui Zhou, A. Mockus. https://doi.org/10.1145/2950290.2950364

Code contributions in Free/Libre and Open Source Software projects are controlled to maintain high software quality. Alternatives to patch-based code contribution tools such as mailing lists and issue trackers have been developed, with pull request systems being the most visible and widely available on GitHub. Is the code contribution process more effective with pull request systems? To answer that, we quantify effectiveness via the rates at which contributions are accepted and ignored, the times until first response and final resolution, and the numbers of contributions. To control for latent variables, our study includes a project that migrated from an issue tracker to the GitHub pull request system and a comparison between projects using mailing lists and pull request systems. Our results show pull request systems to be associated with reduced review times and larger numbers of contributions. However, not all the comparisons indicate substantially better accept or ignore rates in pull request systems. These variations may be most simply explained by differences in the contribution practices the projects employ and may be less affected by the type of tool. Our results clarify the importance of understanding the role of tools in effective management of the broad network of potential contributors and may lead to strategies and practices making code contribution more satisfying and efficient from both contributors' and maintainers' perspectives.
API code recommendation using statistical learning from fine-grained changes
A. Nguyen, Michael C Hilton, Mihai Codoban, H. Nguyen, L. Mast, E. Rademacher, T. Nguyen, Danny Dig. https://doi.org/10.1145/2950290.2950333

Learning and remembering how to use APIs is difficult. While code-completion tools can recommend API methods, browsing a long list of API method names and their documentation is tedious. Moreover, users can easily be overwhelmed with too much information. We present a novel API recommendation approach that taps into the predictive power of repetitive code changes to provide relevant API recommendations for developers. Our approach and tool, APIREC, is based on statistical learning from fine-grained code changes and from the context in which those changes were made. Our empirical evaluation shows that APIREC correctly recommends an API call in the first position 59% of the time, and it recommends the correct API call in the top five positions 77% of the time. This improves over state-of-the-art approaches by 30-160% in top-1 accuracy and 10-30% in top-5 accuracy. Our results show that APIREC performs well even with a one-time, minimal training dataset of 50 publicly available projects.
String analysis for side channels with segmented oracles
Lucas Bang, Abdulbaki Aydin, Quoc-Sang Phan, C. Pasareanu, T. Bultan. https://doi.org/10.1145/2950290.2950362

We present an automated approach for detecting and quantifying side channels in Java programs, which uses symbolic execution, string analysis, and model counting to compute information leakage for a single run of a program. We further extend this approach to compute information leakage over multiple runs for a class of side channels called segmented oracles, where the attacker is able to explore each segment of a secret (for example, each character of a password) independently. We present an efficient technique for segmented oracles that computes information leakage for multiple runs using only the path constraints generated from a single-run symbolic execution. Our implementation uses the symbolic execution tool Symbolic PathFinder (SPF), the SMT solver Z3, and two model-counting constraint solvers, LattE and ABC. Although LattE has previously been used to analyze numeric constraints, in this paper we present an approach for using it to analyze string constraints. We also extend the string constraint solver ABC to analyze both numeric and string constraints, and we integrate ABC into SPF, enabling quantitative symbolic string analysis.
Keep it SIMPLEX: satisfying multiple goals with guarantees in control-based self-adaptive systems
S. Shevtsov, Danny Weyns. https://doi.org/10.1145/2950290.2950301

An increasingly important concern of software engineers is handling uncertainties at design time, such as environment dynamics that may be difficult to predict or requirements that may change during operation. The idea of self-adaptation is to handle such uncertainties at runtime, when the knowledge becomes available. As more systems with strict requirements require self-adaptation, providing guarantees for adaptation has become a high priority. Providing such guarantees with traditional architecture-based approaches has proven challenging. In response, researchers have studied the application of control theory to realize self-adaptation. However, existing control-theoretic approaches applied to adapt software systems have primarily focused on satisfying only a single adaptation goal at a time, which is often too restrictive for real applications. In this paper, we present Simplex Control Adaptation (SimCA), a new approach to self-adaptation that satisfies multiple goals while being optimal with respect to an additional goal. SimCA is robust to measurement inaccuracy and environmental disturbances, and it provides guarantees. We evaluate SimCA on two systems with strict requirements that have to deal with uncertainty: an underwater vehicle system used for oceanic surveillance and a tele-assistance system for health care support.
RABIEF: range analysis based integer error fixing
Xi Cheng. https://doi.org/10.1145/2950290.2983961

The C language has complicated integer semantics. Integer errors harbored in real-world programs can lead to serious software failures or exploitable vulnerabilities. Manually addressing integer errors is labor-intensive and error-prone, and the usability of existing automated techniques is generally poor, as they rely heavily on external specifications or simply transform bugs into crashes. We propose RABIEF, a novel and fully automatic approach to fixing C integer errors based on range analysis. RABIEF is inspired by the following insights: (1) fixes for various integer errors follow typical patterns, including sanitization, explicit cast, and declared type alteration; (2) range analysis provides a sound basis for error detection and guides fix generation. We implemented RABIEF in a tool called Argyi. Its effectiveness and efficiency are substantiated by two facts: (1) Argyi succeeds in fixing 93.9% of 5414 integer bugs from the Juliet test suite, scaling to 600 KLOC within 5500 seconds; (2) Argyi is confirmed to correctly fix 20 errors from 4 real-world programs within only 240 seconds.
WebRanz: web page randomization for better advertisement delivery and web-bot prevention
Weihang Wang, Yunhui Zheng, Xinyu Xing, Yonghwi Kwon, X. Zhang, P. Eugster. https://doi.org/10.1145/2950290.2950352

A rapidly increasing number of web users use ad-blockers to block online advertisements. Ad-blockers are browser-based software that can block most ads on websites, speeding up browsers and saving bandwidth. Despite these benefits to end users, ad-blockers could be catastrophic for the economic structure underlying the web, especially considering the rise of ad-blocking and the number of technologies and services that rely exclusively on ads to cover their costs. In this paper, we introduce WebRanz, which uses a randomization mechanism to circumvent ad-blocking. Using WebRanz, content publishers can constantly mutate the internal HTML elements and element attributes of their web pages without affecting their visual appearance or functionality. Randomization invalidates the predefined patterns that ad-blockers use to filter out ads. Although the design of WebRanz is motivated by evading ad-blockers, WebRanz also helps defend against bot scripts. We evaluate the effectiveness of WebRanz and its overhead using 221 randomly sampled top Alexa web pages and 8 representative bot scripts.