Informal discussions on social platforms (e.g., Stack Overflow) accumulate a large body of programming knowledge in natural language text. Natural language processing (NLP) techniques can be exploited to harvest this knowledge base for software engineering tasks. To make effective use of NLP techniques, a consistent vocabulary is essential. Unfortunately, the same concepts are often intentionally or accidentally mentioned in many different morphological forms in informal discussions, such as abbreviations, synonyms and misspellings. Existing techniques to deal with such morphological forms are either designed for general English or predominantly rely on domain-specific lexical rules. A thesaurus of software-specific terms and commonly-used morphological forms is desirable for normalizing software engineering text, but very difficult to build manually. In this work, we propose an automatic approach to build such a thesaurus. Our approach identifies software-specific terms by contrasting software-specific and general corpora, and infers morphological forms of software-specific terms by combining distributed word semantics, domain-specific lexical rules and transformations, and graph analysis of morphological relations. We evaluate the coverage and accuracy of the resulting thesaurus against community-curated lists of software-specific terms, abbreviations and synonyms. We also manually examine the correctness of the identified abbreviations and synonyms in our thesaurus. We demonstrate the usefulness of our thesaurus in a case study of normalizing questions from Stack Overflow and CodeProject.
{"title":"Unsupervised Software-Specific Morphological Forms Inference from Informal Discussions","authors":"Chunyang Chen, Zhenchang Xing, Ximing Wang","doi":"10.1109/ICSE.2017.48","DOIUrl":"https://doi.org/10.1109/ICSE.2017.48","url":null,"abstract":"Informal discussions on social platforms (e.g., Stack Overflow) accumulates a large body of programming knowledge in natural language text. Natural language process (NLP) techniques can be exploited to harvest this knowledge base for software engineering tasks. To make an effective use of NLP techniques, consistent vocabulary is essential. Unfortunately, the same concepts are often intentionally or accidentally mentioned in many different morphological forms in informal discussions, such as abbreviations, synonyms and misspellings. Existing techniques to deal with such morphological forms are either designed for general English or predominantly rely on domain-specific lexical rules. A thesaurus of software-specific terms and commonly-used morphological forms is desirable for normalizing software engineering text, but very difficult to build manually. In this work, we propose an automatic approach to build such a thesaurus. Our approach identifies software-specific terms by contrasting software-specific and general corpuses, and infers morphological forms of software-specific terms by combining distributed word semantics, domain-specific lexical rules and transformations, and graph analysis of morphological relations. We evaluate the coverage and accuracy of the resulting thesaurus against community-curated lists of software-specific terms, abbreviations and synonyms. We also manually examine the correctness of the identified abbreviations and synonyms in our thesaurus. We demonstrate the usefulness of our thesaurus in a case study of normalizing questions from Stack Overflow and CodeProject.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"920 1","pages":"450-461"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85533259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aleksander Fabijan, Pavel A. Dmitriev, H. H. Olsson, J. Bosch
Software development companies are increasingly aiming to become data-driven by trying to continuously experiment with the products used by their customers. Although familiar with the competitive edge that A/B testing delivers, they seldom succeed in evolving and adopting the methodology. In this paper, based on exhaustive and collaborative case study research in a large software-intensive company with a highly developed experimentation culture, we present the evolution process of moving from ad-hoc customer data analysis towards continuous controlled experimentation at scale. Our main contribution is the "Experimentation Evolution Model" in which we detail three phases of evolution: technical, organizational and business evolution. With our contribution, we aim to provide guidance to practitioners on how to develop and scale continuous experimentation in software organizations with the purpose of becoming data-driven at scale.
{"title":"The Evolution of Continuous Experimentation in Software Product Development: From Data to a Data-Driven Organization at Scale","authors":"Aleksander Fabijan, Pavel A. Dmitriev, H. H. Olsson, J. Bosch","doi":"10.1109/ICSE.2017.76","DOIUrl":"https://doi.org/10.1109/ICSE.2017.76","url":null,"abstract":"Software development companies are increasingly aiming to become data-driven by trying to continuously experiment with the products used by their customers. Although familiar with the competitive edge that the A/B testing technology delivers, they seldom succeed in evolving and adopting the methodology. In this paper, and based on an exhaustive and collaborative case study research in a large software-intense company with highly developed experimentation culture, we present the evolution process of moving from ad-hoc customer data analysis towards continuous controlled experimentation at scale. Our main contribution is the \"Experimentation Evolution Model\" in which we detail three phases of evolution: technical, organizational and business evolution. With our contribution, we aim to provide guidance to practitioners on how to develop and scale continuous experimentation in software organizations with the purpose of becoming data-driven at scale.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"481 1","pages":"770-780"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76374741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Android Wear (AW) is Google's platform for developing applications for wearable devices. Our goal is to take a first step toward a foundation for analysis and testing of AW apps. We focus on a core feature of such apps: notifications issued by a handheld device (e.g., a smartphone) and displayed on a wearable device (e.g., a smartwatch). We first define a formal semantics of AW notifications in order to capture the core features and behavior of the notification mechanism. Next, we describe a constraint-based static analysis to build a model of this run-time behavior. We then use this model to develop a novel testing tool for AW apps. The tool contains a testing framework together with components to support AW-specific coverage criteria and to automate the generation of GUI events on the wearable. These contributions advance the state of the art in the increasingly important area of software for wearable devices.
{"title":"Analysis and Testing of Notifications in Android Wear Applications","authors":"Hailong Zhang, A. Rountev","doi":"10.1109/ICSE.2017.39","DOIUrl":"https://doi.org/10.1109/ICSE.2017.39","url":null,"abstract":"Android Wear (AW) is Google's platform for developing applications for wearable devices. Our goal is to make a first step toward a foundation for analysis and testing of AW apps. We focus on a core feature of such apps: notifications issued by a handheld device (e.g., a smartphone) and displayed on a wearable device (e.g., a smartwatch). We first define a formal semantics of AW notifications in order to capture the core features and behavior of the notification mechanism. Next, we describe a constraint-based static analysis to build a model of this run-time behavior. We then use this model to develop a novel testing tool for AW apps. The tool contains a testing framework together with components to support AW-specific coverage criteria and to automate the generation of GUI events on the wearable. These contributions advance the state of the art in the increasingly important area of software for wearable devices.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"34 1","pages":"347-357"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83736068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uncertainty and conflicting stakeholders' objectives make many requirements and architecture decisions particularly hard. Quantitative probabilistic models allow software architects to analyse such decisions using stochastic simulation and multi-objective optimisation, but the difficulty of elaborating the models is an obstacle to the wider adoption of such techniques. To reduce this obstacle, this paper presents a novel modelling language and analysis tool, called RADAR, intended to facilitate requirements and architecture decision analysis. The language has relations to quantitative AND/OR goal models used in requirements engineering and to feature models used in software product lines. However, it simplifies such models to a minimum set of language constructs essential for decision analysis. The paper presents RADAR's modelling language, automated support for decision analysis, and evaluates its application to four real-world examples.
{"title":"RADAR: A Lightweight Tool for Requirements and Architecture Decision Analysis","authors":"Saheed A. Busari, Emmanuel Letier","doi":"10.1109/ICSE.2017.57","DOIUrl":"https://doi.org/10.1109/ICSE.2017.57","url":null,"abstract":"Uncertainty and conflicting stakeholders' objectives make many requirements and architecture decisions particularly hard. Quantitative probabilistic models allow software architects to analyse such decisions using stochastic simulation and multi-objective optimisation, but the difficulty of elaborating the models is an obstacle to the wider adoption of such techniques. To reduce this obstacle, this paper presents a novel modelling language and analysis tool, called RADAR, intended to facilitate requirements and architecture decision analysis. The language has relations to quantitative AND/OR goal models used in requirements engineering and to feature models used in software product lines. However, it simplifies such models to a minimum set of language constructs essential for decision analysis. The paper presents RADAR's modelling language, automated support for decision analysis, and evaluates its application to four real-world examples.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"125 1","pages":"552-562"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75812844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yutaka Tsutano, Shakthi Bachala, W. Srisa-an, G. Rothermel, Jackson Dinh
When multiple apps on an Android platform interact, faults and security vulnerabilities can occur. Software engineers need to be able to analyze interacting apps to detect such problems. Current approaches for performing such analyses, however, do not scale to the numbers of apps that may need to be considered, and thus, are impractical for application to real-world scenarios. In this paper, we introduce JITANA, a program analysis framework designed to analyze multiple Android apps simultaneously. By using a classloader-based approach instead of a compiler-based approach such as SOOT, JITANA is able to simultaneously analyze large numbers of interacting apps, perform on-demand analysis of large libraries, and effectively analyze dynamically generated code. Empirical studies of JITANA show that it is substantially more efficient than a state-of-the-art approach, and that it can effectively and efficiently analyze complex apps including Facebook, Pokemon Go, and Pandora that the state-of-the-art approach cannot handle.
{"title":"An Efficient, Robust, and Scalable Approach for Analyzing Interacting Android Apps","authors":"Yutaka Tsutano, Shakthi Bachala, W. Srisa-an, G. Rothermel, Jackson Dinh","doi":"10.1109/ICSE.2017.37","DOIUrl":"https://doi.org/10.1109/ICSE.2017.37","url":null,"abstract":"When multiple apps on an Android platform interact, faults and security vulnerabilities can occur. Software engineers need to be able to analyze interacting apps to detect such problems. Current approaches for performing such analyses, however, do not scale to the numbers of apps that may need to be considered, and thus, are impractical for application to real-world scenarios. In this paper, we introduce JITANA, a program analysis framework designed to analyze multiple Android apps simultaneously. By using a classloader-based approach instead of a compiler-based approach such as SOOT, JITANA is able to simultaneously analyze large numbers of interacting apps, perform on-demand analysis of large libraries, and effectively analyze dynamically generated code. Empirical studies of JITANA show that it is substantially more efficient than a state-of-the-art approach, and that it can effectively and efficiently analyze complex apps including Facebook, Pokemon Go, and Pandora that the state-of-the-art approach cannot handle.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"19 1","pages":"324-334"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82426998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Rojas, Thomas D. White, Benjamin S. Clegg, G. Fraser
Writing good software tests is difficult and not every developer's favorite occupation. Mutation testing aims to help by seeding artificial faults (mutants) that good tests should identify, and test generation tools help by providing automatically generated tests. However, mutation tools tend to produce huge numbers of mutants, many of which are trivial, redundant, or semantically equivalent to the original program, while automated test generation tools tend to produce tests that achieve good code coverage but are otherwise weak and have no clear purpose. In this paper, we present an approach based on gamification and crowdsourcing to produce better software tests and mutants: The Code Defenders web-based game lets teams of players compete over a program, where attackers try to create subtle mutants, which the defenders try to counter by writing strong tests. Experiments in controlled and crowdsourced scenarios reveal that writing tests as part of the game is more enjoyable, and that playing Code Defenders results in stronger test suites and mutants than those produced by automated tools.
{"title":"Code Defenders: Crowdsourcing Effective Tests and Subtle Mutants with a Mutation Testing Game","authors":"J. Rojas, Thomas D. White, Benjamin S. Clegg, G. Fraser","doi":"10.1109/ICSE.2017.68","DOIUrl":"https://doi.org/10.1109/ICSE.2017.68","url":null,"abstract":"Writing good software tests is difficult and not every developer's favorite occupation. Mutation testing aims to help by seeding artificial faults (mutants) that good tests should identify, and test generation tools help by providing automatically generated tests. However, mutation tools tend to produce huge numbers of mutants, many of which are trivial, redundant, or semantically equivalent to the original program, automated test generation tools tend to produce tests that achieve good code coverage, but are otherwise weak and have no clear purpose. In this paper, we present an approach based on gamification and crowdsourcing to produce better software tests and mutants: The Code Defenders web-based game lets teams of players compete over a program, where attackers try to create subtle mutants, which the defenders try to counter by writing strong tests. Experiments in controlled and crowdsourced scenarios reveal that writing tests as part of the game is more enjoyable, and that playing Code Defenders results in stronger test suites and mutants than those produced by automated tools.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"44 1","pages":"677-688"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80259615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wanwangying Ma, Lin Chen, X. Zhang, Yuming Zhou, Baowen Xu
GitHub, a popular social-software-development platform, has fostered a variety of software ecosystems where projects depend on one another and practitioners interact with each other. Projects within an ecosystem often have complex inter-dependencies that impose new challenges in bug reporting and fixing. In this paper, we conduct an empirical study on cross-project correlated bugs, i.e., causally related bugs reported to different projects, focusing on two aspects: 1) how developers track the root causes across projects, and 2) how the downstream developers coordinate to deal with upstream bugs. Through manual inspection of bug reports collected from the scientific Python ecosystem and an online survey with developers, this study reveals the common practices of developers and the various factors involved in fixing cross-project bugs. These findings have implications for future software bug analysis at the ecosystem level, and shed light on the requirements that issue trackers must meet for such bugs.
{"title":"How Do Developers Fix Cross-Project Correlated Bugs? A Case Study on the GitHub Scientific Python Ecosystem","authors":"Wanwangying Ma, Lin Chen, X. Zhang, Yuming Zhou, Baowen Xu","doi":"10.1109/ICSE.2017.42","DOIUrl":"https://doi.org/10.1109/ICSE.2017.42","url":null,"abstract":"GitHub, a popular social-software-development platform, has fostered a variety of software ecosystems where projects depend on one another and practitioners interact with each other. Projects within an ecosystem often have complex inter-dependencies that impose new challenges in bug reporting and fixing. In this paper, we conduct an empirical study on cross-project correlated bugs, i.e., causally related bugs reported to different projects, focusing on two aspects: 1) how developers track the root causes across projects, and 2) how the downstream developers coordinate to deal with upstream bugs. Through manual inspection of bug reports collected from the scientific Python ecosystem and an online survey with developers, this study reveals the common practices of developers and the various factors in fixing cross-project bugs. These findings provide implications for future software bug analysis in the scope of ecosystem, as well as shed light on the requirements of issue trackers for such bugs.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"29 1","pages":"381-392"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75295897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Context: Since software development is a complex socio-technical activity that involves coordinating different disciplines and skill sets, it provides ample opportunities for waste to emerge. Waste is any activity that produces no value for the customer or user. Objective: The purpose of this paper is to identify and describe different types of waste in software development. Method: Following Constructivist Grounded Theory, we conducted a two-year five-month participant-observation study of eight software development projects at Pivotal, a software development consultancy. We also interviewed 33 software engineers, interaction designers, and product managers, and analyzed one year of retrospection topics. We iterated between analysis and theoretical sampling until achieving theoretical saturation. Results: This paper introduces the first empirical waste taxonomy. It identifies nine wastes and explores their causes, underlying tensions, and overall relationship to the waste taxonomy found in Lean Software Development. Limitations: Grounded Theory does not support statistical generalization. While the proposed taxonomy appears widely applicable, organizations with different software development cultures may experience different waste types. Conclusion: Software development projects manifest nine types of waste: building the wrong feature or product, mismanaging the backlog, rework, unnecessarily complex solutions, extraneous cognitive load, psychological distress, waiting/multitasking, knowledge loss, and ineffective communication.
{"title":"Software Development Waste","authors":"Todd Sedano, P. Ralph, Cécile Péraire","doi":"10.1109/ICSE.2017.20","DOIUrl":"https://doi.org/10.1109/ICSE.2017.20","url":null,"abstract":"Context: Since software development is a complex socio-technical activity that involves coordinating different disciplines and skill sets, it provides ample opportunities for waste to emerge. Waste is any activity that produces no value for the customer or user. Objective: The purpose of this paper is to identify and describe different types of waste in software development. Method: Following Constructivist Grounded Theory, we conducted a two-year five-month participant-observation study of eight software development projects at Pivotal, a software development consultancy. We also interviewed 33 software engineers, interaction designers, and product managers, and analyzed one year of retrospection topics. We iterated between analysis and theoretical sampling until achieving theoretical saturation. Results: This paper introduces the first empirical waste taxonomy. It identifies nine wastes and explores their causes, underlying tensions, and overall relationship to the waste taxonomy found in Lean Software Development. Limitations: Grounded Theory does not support statistical generalization. While the proposed taxonomy appears widely applicable, organizations with different software development cultures may experience different waste types. Conclusion: Software development projects manifest nine types of waste: building the wrong feature or product, mismanaging the backlog, rework, unnecessarily complex solutions, extraneous cognitive load, psychological distress, waiting/multitasking, knowledge loss, and ineffective communication.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"1 1","pages":"130-140"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88314778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mozhan Soltani, Annibale Panichella, A. van Deursen
To reduce the effort developers have to make for crash debugging, researchers have proposed several solutions for automatic failure reproduction. Recent advances proposed the use of symbolic execution, mutation analysis, and directed model checking as underlying techniques for post-failure analysis of crash stack traces. However, existing approaches still cannot reproduce many real-world crashes due to limitations such as environment dependencies, path explosion, and time complexity. To address these challenges, we present EvoCrash, a post-failure approach which uses a novel Guided Genetic Algorithm (GGA) to cope with the large search space characterizing real-world software programs. Our empirical study on three open-source systems shows that EvoCrash can replicate 41 (82%) of real-world crashes, 34 (89%) of which are useful reproductions for debugging purposes, outperforming the state-of-the-art in crash replication.
{"title":"A Guided Genetic Algorithm for Automated Crash Reproduction","authors":"Mozhan Soltani, Annibale Panichella, A. van Deursen","doi":"10.1109/ICSE.2017.27","DOIUrl":"https://doi.org/10.1109/ICSE.2017.27","url":null,"abstract":"To reduce the effort developers have to make for crash debugging, researchers have proposed several solutions for automatic failure reproduction. Recent advances proposed the use of symbolic execution, mutation analysis, and directed model checking as underling techniques for post-failure analysis of crash stack traces. However, existing approaches still cannot reproduce many real-world crashes due to such limitations as environment dependencies, path explosion, and time complexity. To address these challenges, we present EvoCrash, a post-failure approach which uses a novel Guided Genetic Algorithm (GGA) to cope with the large search space characterizing real-world software programs. Our empirical study on three open-source systems shows that EvoCrash can replicate 41 (82%) of real-world crashes, 34 (89%) of which are useful reproductions for debugging purposes, outperforming the state-of-the-art in crash replication.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"4 1","pages":"209-220"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89561794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Bertolino, Breno Miranda, R. Pietrantuono, S. Russo
We introduce covrel, an adaptive software testing approach based on the combined use of operational profile and coverage spectrum, with the ultimate goal of improving the delivered reliability of the program under test. Operational profile-based testing is a black-box technique that selects test cases having the largest impact on failure probability in operation; as such, it is considered well suited when reliability is a major concern. Program spectrum is a characterization of a program's behavior in terms of the code entities (e.g., branches, statements, functions) that are covered as the program executes. The driving idea of covrel is to complement operational profile information with white-box coverage measures based on count spectra, so as to dynamically select the most effective test cases for reliability improvement. In particular, we bias operational profile-based test selection towards those entities covered less frequently. We assess the approach through experiments with 18 versions from 4 subjects commonly used in software testing research, comparing results with traditional operational and coverage testing. Results show that exploiting operational and coverage data in a combined adaptive way actually pays off in terms of reliability improvement, with covrel outperforming conventional operational testing in more than 80% of the cases.
{"title":"Adaptive Coverage and Operational Profile-Based Testing for Reliability Improvement","authors":"A. Bertolino, Breno Miranda, R. Pietrantuono, S. Russo","doi":"10.1109/ICSE.2017.56","DOIUrl":"https://doi.org/10.1109/ICSE.2017.56","url":null,"abstract":"We introduce covrel, an adaptive software testing approach based on the combined use of operational profile and coverage spectrum, with the ultimate goal of improving the delivered reliability of the program under test. Operational profile-based testing is a black-box technique that selects test cases having the largest impact on failure probability in operation, as such, it is considered well suited when reliability is a major concern. Program spectrum is a characterization of a program's behavior in terms of the code entities (e.g., branches, statements, functions) that are covered as the program executes. The driving idea of covrel is to complement operational profile information with white-box coverage measures based on count spectra, so as to dynamically select the most effective test cases for reliability improvement. In particular, we bias operational profile-based test selection towards those entities covered less frequently. We assess the approach by experiments with 18 versions from 4 subjects commonly used in software testing research, comparing results with traditional operational and coverage testing. Results show that exploiting operational and coverage data in a combined adaptive way actually pays in terms of reliability improvement, with covrel overcoming conventional operational testing in more than 80% of the cases.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"101 1","pages":"541-551"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80605621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}