D. Fava, Daniel G. Shapiro, J. Osborn, Martin Schäf, E. J. Whitehead
Invariant discovery is one of the central problems in software verification. This paper reports on an approach that addresses this problem in a novel way; it crowdsources logical expressions for likely invariants by turning invariant discovery into a computer game. The game, called Binary Fission, employs a classification model. In it, players compose preconditions by separating program states that preserve or violate program assertions. The players have no special expertise in formal methods or programming, and are not specifically aware they are solving verification tasks. We show that Binary Fission players discover concise, general, novel, and human readable program preconditions. Our proof of concept suggests that crowdsourcing offers a feasible and promising path towards the practical application of verification technology.
{"title":"Crowdsourcing Program Preconditions via a Classification Game","authors":"D. Fava, Daniel G. Shapiro, J. Osborn, Martin Schäf, E. J. Whitehead","doi":"10.1145/2884781.2884865","DOIUrl":"https://doi.org/10.1145/2884781.2884865","url":null,"abstract":"Invariant discovery is one of the central problems in software verification. This paper reports on an approach that addresses this problem in a novel way; it crowdsources logical expressions for likely invariants by turning invariant discovery into a computer game. The game, called Binary Fission, employs a classification model. In it, players compose preconditions by separating program states that preserve or violate program assertions. The players have no special expertise in formal methods or programming, and are not specifically aware they are solving verification tasks. We show that Binary Fission players discover concise, general, novel, and human readable program preconditions. Our proof of concept suggests that crowdsourcing offers a feasible and promising path towards the practical application of verification technology.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"6 1","pages":"1086-1096"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83177331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuai Wang, Shaukat Ali, T. Yue, Yan Li, Marius Liaaen
Many software engineering problems are multi-objective in nature, which has been largely recognized by the Search-based Software Engineering (SBSE) community. In this regard, Pareto- based search algorithms, e.g., Non-dominated Sorting Genetic Algorithm II, have already shown good performance for solving multi-objective optimization problems. These algorithms produce Pareto fronts, where each Pareto front consists of a set of non- dominated solutions. Eventually, a user selects one or more of the solutions from a Pareto front for their specific problems. A key challenge of applying Pareto-based search algorithms is to select appropriate quality indicators, e.g., hypervolume, to assess the quality of Pareto fronts. Based on the results of an extended literature review, we found that the current literature and practice in SBSE lacks a practical guide for selecting quality indicators despite a large number of published SBSE works. In this direction, the paper presents a practical guide for the SBSE community to select quality indicators for assessing Pareto-based search algorithms in different software engineering contexts. The practical guide is derived from the following complementary theoretical and empirical methods: 1) key theoretical foundations of quality indicators; 2) evidence from an extended literature review; and 3) evidence collected from an extensive experiment that was conducted to evaluate eight quality indicators from four different categories with six Pareto-based search algorithms using three real industrial problems from two diverse domains.
{"title":"A Practical Guide to Select Quality Indicators for Assessing Pareto-Based Search Algorithms in Search-Based Software Engineering","authors":"Shuai Wang, Shaukat Ali, T. Yue, Yan Li, Marius Liaaen","doi":"10.1145/2884781.2884880","DOIUrl":"https://doi.org/10.1145/2884781.2884880","url":null,"abstract":"Many software engineering problems are multi-objective in nature, which has been largely recognized by the Search-based Software Engineering (SBSE) community. In this regard, Pareto- based search algorithms, e.g., Non-dominated Sorting Genetic Algorithm II, have already shown good performance for solving multi-objective optimization problems. These algorithms produce Pareto fronts, where each Pareto front consists of a set of non- dominated solutions. Eventually, a user selects one or more of the solutions from a Pareto front for their specific problems. A key challenge of applying Pareto-based search algorithms is to select appropriate quality indicators, e.g., hypervolume, to assess the quality of Pareto fronts. Based on the results of an extended literature review, we found that the current literature and practice in SBSE lacks a practical guide for selecting quality indicators despite a large number of published SBSE works. In this direction, the paper presents a practical guide for the SBSE community to select quality indicators for assessing Pareto-based search algorithms in different software engineering contexts. The practical guide is derived from the following complementary theoretical and empirical methods: 1) key theoretical foundations of quality indicators; 2) evidence from an extended literature review; and 3) evidence collected from an extensive experiment that was conducted to evaluate eight quality indicators from four different categories with six Pareto-based search algorithms using three real industrial problems from two diverse domains.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"18 1","pages":"631-642"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85922899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qingwei Lin, Jian-Guang Lou, Hongyu Zhang, D. Zhang
One challenge for maintaining a large-scale software system, especially an online service system, is to quickly respond to customer issues. The issue reports typically have many categorical attributes that reflect the characteristics of the issues. For a commercial system, most of the time the volume of reported issues is relatively constant. Sometimes, there are emerging issues that lead to significant volume increase. It is important for support engineers to efficiently and effectively identify and resolve such emerging issues, since they have impacted a large number of customers. Currently, problem identification for an emerging issue is a tedious and error-prone process, because it requires support engineers to manually identify a particular attribute combination that characterizes the emerging issue among a large number of attribute combinations. We call such an attribute combination effective combination, which is important for issue isolation and diagnosis. In this paper, we propose iDice, an approach that can identify the effective combination for an emerging issue with high quality and performance. We evaluate the effectiveness and efficiency of iDice through experiments. We have also successfully applied iDice to several Microsoft online service systems in production. The results confirm that iDice can help identify emerging issues and reduce maintenance effort.
{"title":"iDice: Problem Identification for Emerging Issues","authors":"Qingwei Lin, Jian-Guang Lou, Hongyu Zhang, D. Zhang","doi":"10.1145/2884781.2884795","DOIUrl":"https://doi.org/10.1145/2884781.2884795","url":null,"abstract":"One challenge for maintaining a large-scale software system, especially an online service system, is to quickly respond to customer issues. The issue reports typically have many categorical attributes that reflect the characteristics of the issues. For a commercial system, most of the time the volume of reported issues is relatively constant. Sometimes, there are emerging issues that lead to significant volume increase. It is important for support engineers to efficiently and effectively identify and resolve such emerging issues, since they have impacted a large number of customers. Currently, problem identification for an emerging issue is a tedious and error-prone process, because it requires support engineers to manually identify a particular attribute combination that characterizes the emerging issue among a large number of attribute combinations. We call such an attribute combination effective combination, which is important for issue isolation and diagnosis. In this paper, we propose iDice, an approach that can identify the effective combination for an emerging issue with high quality and performance. We evaluate the effectiveness and efficiency of iDice through experiments. We have also successfully applied iDice to several Microsoft online service systems in production. The results confirm that iDice can help identify emerging issues and reduce maintenance effort.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"33 1","pages":"214-224"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78335813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Empirical software engineering has produced a steady stream of evidence-based results concerning the factors that affect important outcomes such as cost, quality, and interval. However, programmers often also have strongly-held a priori opinions about these issues. These opinions are important, since developers are highlytrained professionals whose beliefs would doubtless affect their practice. As in evidence-based medicine, disseminating empirical findings to developers is a key step in ensuring that the findings impact practice. In this paper, we describe a case study, on the prior beliefs of developers at Microsoft, and the relationship of these beliefs to actual empirical data on the projects in which these developers work. Our findings are that a) programmers do indeed have very strong beliefs on certain topics b) their beliefs are primarily formed based on personal experience, rather than on findings in empirical research and c) beliefs can vary with each project, but do not necessarily correspond with actual evidence in that project. Our findings suggest that more effort should be taken to disseminate empirical findings to developers and that more in-depth study the interplay of belief and evidence in software practice is needed.
{"title":"Belief & Evidence in Empirical Software Engineering","authors":"Premkumar T. Devanbu, Thomas Zimmermann, C. Bird","doi":"10.1145/2884781.2884812","DOIUrl":"https://doi.org/10.1145/2884781.2884812","url":null,"abstract":"Empirical software engineering has produced a steady stream of evidence-based results concerning the factors that affect important outcomes such as cost, quality, and interval. However, programmers often also have strongly-held a priori opinions about these issues. These opinions are important, since developers are highlytrained professionals whose beliefs would doubtless affect their practice. As in evidence-based medicine, disseminating empirical findings to developers is a key step in ensuring that the findings impact practice. In this paper, we describe a case study, on the prior beliefs of developers at Microsoft, and the relationship of these beliefs to actual empirical data on the projects in which these developers work. Our findings are that a) programmers do indeed have very strong beliefs on certain topics b) their beliefs are primarily formed based on personal experience, rather than on findings in empirical research and c) beliefs can vary with each project, but do not necessarily correspond with actual evidence in that project. Our findings suggest that more effort should be taken to disseminate empirical findings to developers and that more in-depth study the interplay of belief and evidence in software practice is needed.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"2 1","pages":"108-119"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81756520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Schröter, S. Krieter, Thomas Thüm, Fabian Benduhn, G. Saake
Today’s software systems are often customizable by means of load-time or compile-time configuration options. These options are typically not independent and their dependencies can be specified by means of feature models. As many industrial systems contain thousands of options, the maintenance and utilization of feature models is a challenge for all stakeholders. In the last two decades, numerous approaches have been presented to support stakeholders in analyzing feature models. Such analyses are commonly reduced to satisfiability problems, which suffer from the growing number of options. While first attempts have been made to decompose feature models into smaller parts, they still require to compose all parts for analysis. We propose the concept of a feature-model interface that only consists of a subset of features, typically selected by experts, and hides all other features and dependencies. Based on a formalization of feature-model interfaces, we prove compositionality properties. We evaluate feature-model interfaces using a three-month history of an industrial feature model from the automotive domain with 18,616 features. Our results indicate performance benefits especially under evolution as often only parts of the feature model need to be analyzed again.
{"title":"Feature-Model Interfaces: The Highway to Compositional Analyses of Highly-Configurable Systems","authors":"R. Schröter, S. Krieter, Thomas Thüm, Fabian Benduhn, G. Saake","doi":"10.1145/2884781.2884823","DOIUrl":"https://doi.org/10.1145/2884781.2884823","url":null,"abstract":"Today’s software systems are often customizable by means of load-time or compile-time configuration options. These options are typically not independent and their dependencies can be specified by means of feature models. As many industrial systems contain thousands of options, the maintenance and utilization of feature models is a challenge for all stakeholders. In the last two decades, numerous approaches have been presented to support stakeholders in analyzing feature models. Such analyses are commonly reduced to satisfiability problems, which suffer from the growing number of options. While first attempts have been made to decompose feature models into smaller parts, they still require to compose all parts for analysis. We propose the concept of a feature-model interface that only consists of a subset of features, typically selected by experts, and hides all other features and dependencies. Based on a formalization of feature-model interfaces, we prove compositionality properties. We evaluate feature-model interfaces using a three-month history of an industrial feature model from the automotive domain with 18,616 features. Our results indicate performance benefits especially under evolution as often only parts of the feature model need to be analyzed again.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"1 1","pages":"667-678"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91057305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bogdan Vasilescu, Kelly Blincoe, Qi Xuan, Casey Casalnuovo, D. Damian, Premkumar T. Devanbu, V. Filkov
Software development has always inherently required multitasking: developers switch between coding, reviewing, testing, designing, and meeting with colleagues. The advent of software ecosystems like GitHub has enabled something new: the ability to easily switch between projects. Developers also have social incentives to contribute to many projects; prolific contributors gain social recognition and (eventually) economic rewards. Multitasking, however, comes at a cognitive cost: frequent context-switches can lead to distraction, sub-standard work, and even greater stress. In this paper, we gather ecosystem-level data on a group of programmers working on a large collection of projects. We develop models and methods for measuring the rate and breadth of a developers' context-switching behavior, and we study how context-switching affects their productivity. We also survey developers to understand the reasons for and perceptions of multitasking. We find that the most common reason for multitasking is interrelationships and dependencies between projects. Notably, we find that the rate of switching and breadth (number of projects) of a developer's work matter. Developers who work on many projects have higher productivity if they focus on few projects per day. Developers that switch projects too much during the course of a day have lower productivity as they work on more projects overall. Despite these findings, developers perceptions of the benefits of multitasking are varied.
{"title":"The Sky Is Not the Limit: Multitasking Across GitHub Projects","authors":"Bogdan Vasilescu, Kelly Blincoe, Qi Xuan, Casey Casalnuovo, D. Damian, Premkumar T. Devanbu, V. Filkov","doi":"10.1145/2884781.2884875","DOIUrl":"https://doi.org/10.1145/2884781.2884875","url":null,"abstract":"Software development has always inherently required multitasking: developers switch between coding, reviewing, testing, designing, and meeting with colleagues. The advent of software ecosystems like GitHub has enabled something new: the ability to easily switch between projects. Developers also have social incentives to contribute to many projects; prolific contributors gain social recognition and (eventually) economic rewards. Multitasking, however, comes at a cognitive cost: frequent context-switches can lead to distraction, sub-standard work, and even greater stress. In this paper, we gather ecosystem-level data on a group of programmers working on a large collection of projects. We develop models and methods for measuring the rate and breadth of a developers' context-switching behavior, and we study how context-switching affects their productivity. We also survey developers to understand the reasons for and perceptions of multitasking. We find that the most common reason for multitasking is interrelationships and dependencies between projects. Notably, we find that the rate of switching and breadth (number of projects) of a developer's work matter. Developers who work on many projects have higher productivity if they focus on few projects per day. Developers that switch projects too much during the course of a day have lower productivity as they work on more projects overall. Despite these findings, developers perceptions of the benefits of multitasking are varied.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"44 1","pages":"994-1005"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88353245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The use of virtual devices in place of physical hardware is increasing in activities such as design, testing and debugging. Yet virtual devices are simply software applications, and like all software they are prone to faults. A full system simulator (FSS), is a class of virtual machine that includes a large set of virtual devices – enough to run the full target software stack. Defects in an FSS virtual device may have cascading effects as the incorrect behavior can be propagated forward to many different platforms as well as to guest programs. In this work we present VDTest, a novel framework for testing virtual devices within an FSS. VDTest begins by generat- ing a test specification obtained through static analysis. It then employs a two-phase testing approach to test virtual components both individually and in combination. It lever- ages a differential oracle strategy, taking advantage of the existence of a physical or golden device to eliminate the need for manually generating test oracles. In an empirical study using both open source and commercial FSSs, we found 64 faults, 83% more than random testing.
{"title":"VDTest: An Automated Framework to Support Testing for Virtual Devices","authors":"Tingting Yu, Xiao Qu, Myra B. Cohen","doi":"10.1145/2884781.2884866","DOIUrl":"https://doi.org/10.1145/2884781.2884866","url":null,"abstract":"The use of virtual devices in place of physical hardware is increasing in activities such as design, testing and debugging. Yet virtual devices are simply software applications, and like all software they are prone to faults. A full system simulator (FSS), is a class of virtual machine that includes a large set of virtual devices – enough to run the full target software stack. Defects in an FSS virtual device may have cascading effects as the incorrect behavior can be propagated forward to many different platforms as well as to guest programs. In this work we present VDTest, a novel framework for testing virtual devices within an FSS. VDTest begins by generat- ing a test specification obtained through static analysis. It then employs a two-phase testing approach to test virtual components both individually and in combination. It lever- ages a differential oracle strategy, taking advantage of the existence of a physical or golden device to eliminate the need for manually generating test oracles. In an empirical study using both open source and commercial FSSs, we found 64 faults, 83% more than random testing.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"9 1","pages":"583-594"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90483991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conventional concolic testing has been used to provide high coverage of paths in statically typed languages. While it has also been applied in the context of JavaScript (JS) programs, we observe that applying concolic testing to dynamically-typed JS programs involves tackling unique problems to ensure scalability. In particular, a naive type-agnostic extension of concolic testing to JS programs causes generation of large number of inputs. Consequently, many executions operate on undefined values and repeatedly explore same paths resulting in redundant tests, thus diminishing the scalability of testing drastically. In this paper, we address this problem by proposing a simple yet effective approach that incorporates type-awareness intelligently in conventional concolic testing to reduce the number of generated inputs for JS programs. We extend our approach inter-procedurally by generating preconditions for each function that provide a summary of the relation between the variable types and paths. Employing the function preconditions when testing reduces the number of inputs generated even further. We implement our ideas and validate it on a number of open-source JS programs (and libraries). For a significant percentage (on average 50%) of the functions, we observe that type-aware concolic testing generates a minuscule percentage (less than 5%) of the inputs as compared to conventional concolic testing approach implemented on top of Jalangi. On average, this approach achieves over 97% of line coverage and over 94% of branch coverage for all the functions across all benchmarks. Moreover, the use of function preconditions reduces the number of inputs generated by 50%. We also demonstrate the use of function preconditions in automatically avoiding real crashes due to incorrectly typed objects.
{"title":"Type-Aware Concolic Testing of JavaScript Programs","authors":"Monika Dhok, M. Ramanathan, Nishant Sinha","doi":"10.1145/2884781.2884859","DOIUrl":"https://doi.org/10.1145/2884781.2884859","url":null,"abstract":"Conventional concolic testing has been used to provide high coverage of paths in statically typed languages. While it has also been applied in the context of JavaScript (JS) programs, we observe that applying concolic testing to dynamically-typed JS programs involves tackling unique problems to ensure scalability. In particular, a naive type-agnostic extension of concolic testing to JS programs causes generation of large number of inputs. Consequently, many executions operate on undefined values and repeatedly explore same paths resulting in redundant tests, thus diminishing the scalability of testing drastically. In this paper, we address this problem by proposing a simple yet effective approach that incorporates type-awareness intelligently in conventional concolic testing to reduce the number of generated inputs for JS programs. We extend our approach inter-procedurally by generating preconditions for each function that provide a summary of the relation between the variable types and paths. Employing the function preconditions when testing reduces the number of inputs generated even further. We implement our ideas and validate it on a number of open-source JS programs (and libraries). For a significant percentage (on average 50%) of the functions, we observe that type-aware concolic testing generates a minuscule percentage (less than 5%) of the inputs as compared to conventional concolic testing approach implemented on top of Jalangi. On average, this approach achieves over 97% of line coverage and over 94% of branch coverage for all the functions across all benchmarks. Moreover, the use of function preconditions reduces the number of inputs generated by 50%. We also demonstrate the use of function preconditions in automatically avoiding real crashes due to incorrectly typed objects.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"43 1","pages":"168-179"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86007833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Konrad Jamrozik, Philipp von Styp-Rekowsky, A. Zeller
We present sandbox mining, a technique to confine an application to resources accessed during automatic testing. Sandbox mining first explores software behavior by means of automatic test generation, and extracts the set of resources accessed during these tests. This set is then used as a sandbox, blocking access to resources not used during testing. The mined sandbox thus protects against behavior changes such as the activation of latent malware, infections, targeted attacks, or malicious updates. The use of test generation makes sandbox mining a fully automatic process that can be run by vendors and end users alike. Our BOXMATE prototype requires less than one hour to extract a sandbox from an Android app, with few to no confirmations required for frequently used functionality.
{"title":"Mining Sandboxes","authors":"Konrad Jamrozik, Philipp von Styp-Rekowsky, A. Zeller","doi":"10.1145/2884781.2884782","DOIUrl":"https://doi.org/10.1145/2884781.2884782","url":null,"abstract":"We present sandbox mining, a technique to confine an application to resources accessed during automatic testing. Sandbox mining first explores software behavior by means of automatic test generation, and extracts the set of resources accessed during these tests. This set is then used as a sandbox, blocking access to resources not used during testing. The mined sandbox thus protects against behavior changes such as the activation of latent malware, infections, targeted attacks, or malicious updates. The use of test generation makes sandbox mining a fully automatic process that can be run by vendors and end users alike. Our BOXMATE prototype requires less than one hour to extract a sandbox from an Android app, with few to no confirmations required for frequently used functionality.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"22 1","pages":"37-48"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84360646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rocky Slavin, Xiaoyin Wang, M. Hosseini, James Hester, R. Krishnan, Jaspreet Bhatia, T. Breaux, Jianwei Niu
Mobile applications frequently access sensitive personal informa- tion to meet user or business requirements. Because such informa- tion is sensitive in general, regulators increasingly require mobile- app developers to publish privacy policies that describe what infor- mation is collected. Furthermore, regulators have fined companies when these policies are inconsistent with the actual data practices of mobile apps. To help mobile-app developers check their pri- vacy policies against their apps’ code for consistency, we propose a semi-automated framework that consists of a policy terminology- API method map that links policy phrases to API methods that pro- duce sensitive information, and information flow analysis to detect misalignments. We present an implementation of our framework based on a privacy-policy-phrase ontology and a collection of map- pings from API methods to policy phrases. Our empirical eval- uation on 477 top Android apps discovered 341 potential privacy policy violations.
{"title":"Toward a Framework for Detecting Privacy Policy Violations in Android Application Code","authors":"Rocky Slavin, Xiaoyin Wang, M. Hosseini, James Hester, R. Krishnan, Jaspreet Bhatia, T. Breaux, Jianwei Niu","doi":"10.1145/2884781.2884855","DOIUrl":"https://doi.org/10.1145/2884781.2884855","url":null,"abstract":"Mobile applications frequently access sensitive personal informa- tion to meet user or business requirements. Because such informa- tion is sensitive in general, regulators increasingly require mobile- app developers to publish privacy policies that describe what infor- mation is collected. Furthermore, regulators have fined companies when these policies are inconsistent with the actual data practices of mobile apps. To help mobile-app developers check their pri- vacy policies against their apps’ code for consistency, we propose a semi-automated framework that consists of a policy terminology- API method map that links policy phrases to API methods that pro- duce sensitive information, and information flow analysis to detect misalignments. We present an implementation of our framework based on a privacy-policy-phrase ontology and a collection of map- pings from API methods to policy phrases. Our empirical eval- uation on 477 top Android apps discovered 341 potential privacy policy violations.","PeriodicalId":6485,"journal":{"name":"2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)","volume":"20 1","pages":"25-36"},"PeriodicalIF":0.0,"publicationDate":"2016-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77743999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}