FLOSS Participants' Perceptions About Gender and Inclusiveness: A Survey
Amanda Lee, Jeffrey C. Carver. ICSE 2019, pp. 677-687. DOI: 10.1109/ICSE.2019.00077

Background: While FLOSS projects espouse openness and acceptance for all, in practice, female contributors often face discriminatory barriers to contribution. Aims: In this paper, we examine the extent to which these problems still exist. We also study male and female contributors' perceptions of other contributors. Method: We surveyed participants from 15 FLOSS projects, asking a series of open-ended, closed-ended, and behavioral scale questions to gather information about the issue of gender in FLOSS projects. Results: Though many of those we surveyed expressed a positive sentiment towards females who participate in FLOSS projects, some were still strongly against their inclusion. Often, the respondents who were against inclusiveness also believed their own sentiments were the prevailing belief in the community, contrary to our findings. Others did not see the purpose of attempting to be inclusive, expressing the sentiment that a discussion of gender has no place in FLOSS. Conclusions: FLOSS projects have started to move forward in terms of gender acceptance. However, there is still a need for more progress in the inclusion of gender-diverse contributors.
Automatically Generating Precise Oracles from Structured Natural Language Specifications
Manish Motwani, Yuriy Brun. ICSE 2019, pp. 188-199. DOI: 10.1109/ICSE.2019.00035

Software specifications often use natural language to describe the desired behavior, but such specifications are difficult to verify automatically. We present Swami, an automated technique that extracts test oracles and generates executable tests from structured natural language specifications. Swami focuses on exceptional behavior and boundary conditions that often cause field failures but that developers often fail to manually write tests for. Evaluated on the official JavaScript specification (ECMA-262), 98.4% of the tests Swami generated were precise to the specification. Using Swami to augment developer-written test suites improved coverage and identified 1 previously unknown defect and 15 missing JavaScript features in Rhino, 1 previously unknown defect in Node.js, and 18 semantic ambiguities in the ECMA-262 specification.
Platforms like Stack Overflow and GitHub's gist system promote the sharing of ideas and programming techniques via the distribution of code snippets designed to illustrate particular tasks. Python, a popular and fast-growing programming language, sees heavy use on both sites, with nearly one million questions asked on Stack Overflow and 400 thousand public gists on GitHub. Unfortunately, around 75% of the Python example code shared through these sites cannot be directly executed. When run in a clean environment, over 50% of public Python gists fail due to an import error for a missing library. We present DockerizeMe, a technique for inferring the dependencies needed to execute a Python code snippet without import error. DockerizeMe starts with offline knowledge acquisition of the resources and dependencies for popular Python packages from the Python Package Index (PyPI). It then builds Docker specifications using a graph-based inference procedure. Our inference procedure resolves import errors in 892 out of nearly 3,000 gists from the Gistable dataset for which Gistable's baseline approach could not find and install all dependencies.
{"title":"DockerizeMe: Automatic Inference of Environment Dependencies for Python Code Snippets","authors":"Eric Horton, Chris Parnin","doi":"10.1109/ICSE.2019.00047","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00047","url":null,"abstract":"Platforms like Stack Overflow and GitHub's gist system promote the sharing of ideas and programming techniques via the distribution of code snippets designed to illustrate particular tasks. Python, a popular and fast-growing programming language, sees heavy use on both sites, with nearly one million questions asked on Stack Overflow and 400 thousand public gists on GitHub. Unfortunately, around 75% of the Python example code shared through these sites cannot be directly executed. When run in a clean environment, over 50% of public Python gists fail due to an import error for a missing library. We present DockerizeMe, a technique for inferring the dependencies needed to execute a Python code snippet without import error. DockerizeMe starts with offline knowledge acquisition of the resources and dependencies for popular Python packages from the Python Package Index (PyPI). It then builds Docker specifications using a graph-based inference procedure. Our inference procedure resolves import errors in 892 out of nearly 3,000 gists from the Gistable dataset for which Gistable's baseline approach could not find and install all dependencies.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"38 1","pages":"328-338"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82635248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RESTler: Stateful REST API Fuzzing
Vaggelis Atlidakis, Patrice Godefroid, Marina Polishchuk. ICSE 2019, pp. 748-758. DOI: 10.1109/ICSE.2019.00083

This paper introduces RESTler, the first stateful REST API fuzzer. RESTler analyzes the API specification of a cloud service and generates sequences of requests that automatically test the service through its API. RESTler generates test sequences by (1) inferring producer-consumer dependencies among request types declared in the specification (e.g., inferring that "a request B should be executed after request A" because B takes as an input a resource-id x produced by A) and by (2) analyzing dynamic feedback from responses observed during prior test executions in order to generate new tests (e.g., learning that "a request C after a request sequence A;B is refused by the service" and therefore avoiding this combination in the future). We present experimental results showing that these two techniques are necessary to thoroughly exercise a service under test while pruning the large search space of possible request sequences. We used RESTler to test GitLab, an open-source Git service, as well as several Microsoft Azure and Office365 cloud services. RESTler found 28 bugs in GitLab and several bugs in each of the Azure and Office365 cloud services tested so far. These bugs have been confirmed and fixed by the service owners.
Interactive Production Performance Feedback in the IDE
Jürgen Cito, P. Leitner, M. Rinard, H. Gall. ICSE 2019, pp. 971-981. DOI: 10.1109/ICSE.2019.00102

Because of differences between development and production environments, many software performance problems are detected only after software enters production. We present PerformanceHat, a new system that uses profiling information from production executions to develop a global performance model suitable for integration into interactive development environments. PerformanceHat's ability to incrementally update this global model as the software is changed in the development environment enables it to deliver near real-time predictions of performance consequences reflecting the impact on the production environment. We implement PerformanceHat as an Eclipse plugin and evaluate it in a controlled experiment with 20 professional software developers implementing several software maintenance tasks using our approach and a representative baseline (Kibana). Our results indicate that developers using PerformanceHat were significantly faster in (1) detecting the performance problem, and (2) finding the root cause of the problem. These results provide encouraging evidence that our approach helps developers detect, prevent, and debug production performance problems during development before the problem manifests in production.
DeepPerf: Performance Prediction for Configurable Software with Deep Sparse Neural Network
Huong Ha, Hongyu Zhang. ICSE 2019, pp. 1095-1106. DOI: 10.1109/ICSE.2019.00113

Many software systems provide users with a set of configuration options, and different configurations may lead to different runtime performance. Because the number of possible configurations grows exponentially with the number of options, it is difficult to exhaustively deploy and measure system performance under all of them. Recently, several learning methods have been proposed to build a performance prediction model from data collected on a small sample of configurations, and then use the model to predict system performance under a new configuration. In this paper, we propose a novel approach to modeling highly configurable software systems using a deep feedforward neural network (FNN) combined with a sparsity regularization technique such as L1 regularization. We also design a practical search strategy for efficiently tuning the network hyperparameters automatically. Our method, called DeepPerf, predicts performance values of highly configurable software systems with binary and/or numeric configuration options at much higher accuracy, and with less training data, than state-of-the-art approaches. Experimental results on eleven public real-world datasets confirm the effectiveness of our approach.
Investigating the Effects of Gender Bias on GitHub
Nasif Imtiaz, Justin Middleton, Joymallya Chakraborty, N. Robson, Gina R. Bai, E. Murphy-Hill. ICSE 2019, pp. 700-711. DOI: 10.1109/ICSE.2019.00079

Diversity, including gender diversity, is valued by many software development organizations, yet the field remains dominated by men. One reason for this lack of diversity is gender bias. In this paper, we study the effects of that bias by using an existing framework derived from the gender studies literature. We adapt the four main effects proposed in the framework by posing hypotheses about how they might manifest on GitHub, then evaluate those hypotheses quantitatively. While our results show that the effects of gender bias are largely invisible on the GitHub platform itself, there are still signals of women concentrating their work in fewer places and being more restrained in communication than men.
Multifaceted Automated Analyses for Variability-Intensive Embedded Systems
Sami Lazreg, Maxime Cordy, P. Collet, P. Heymans, Sébastien Mosser. ICSE 2019, pp. 854-865. DOI: 10.1109/ICSE.2019.00092

Embedded systems, like those found in the automotive domain, must comply with stringent functional and non-functional requirements. To fulfil these requirements, engineers are confronted with a plethora of design alternatives at both the software and hardware level, from which they must select the optimal solution with respect to possibly antagonistic quality attributes (e.g. manufacturing cost vs. execution speed). We propose a model-driven framework to assist engineers in this choice. It captures high-level specifications of the system in the form of variable dataflows and configurable hardware platforms. A mapping algorithm then derives the design space, i.e. the set of compatible pairs of application and platform variants, and a variability-aware executable model, which encodes the functional and non-functional behaviour of all viable system variants. Novel verification algorithms then pinpoint the optimal system variants efficiently. The benefits of our approach are evaluated through a real-world case study from the automotive industry.
CRADLE: Cross-Backend Validation to Detect and Localize Bugs in Deep Learning Libraries
H. Pham, Thibaud Lutellier, Weizhen Qi, Lin Tan. ICSE 2019, pp. 1027-1038. DOI: 10.1109/ICSE.2019.00107

Deep learning (DL) systems are widely used in domains including aircraft collision avoidance systems, Alzheimer's disease diagnosis, and autonomous vehicles. Despite the requirement for high reliability, DL systems are difficult to test. Existing DL testing work focuses on testing the DL models, not the implementations (e.g., DL software libraries) of the models. One key challenge of testing DL libraries is the difficulty of knowing the expected output of a DL library for a given input instance. Fortunately, there are multiple implementations of the same DL algorithms in different DL libraries. Thus, we propose CRADLE, a new approach that focuses on finding and localizing bugs in DL software libraries. CRADLE (1) performs cross-implementation inconsistency checking to detect bugs in DL libraries, and (2) leverages anomaly propagation tracking and analysis to localize the faulty functions in DL libraries that cause the bugs. We evaluate CRADLE on three libraries (TensorFlow, CNTK, and Theano), 11 datasets (including ImageNet, MNIST, and the KGS Go game dataset), and 30 pre-trained models. CRADLE detects 12 bugs and 104 unique inconsistencies, and highlights functions relevant to the causes of all 104 unique inconsistencies.
Investigating the Impact of Multiple Dependency Structures on Software Defects
Di Cui, Ting Liu, Yuanfang Cai, Q. Zheng, Qiong Feng, Wuxia Jin, Jiaqi Guo, YunHuan Qu. ICSE 2019, pp. 584-595. DOI: 10.1109/ICSE.2019.00069

Over the past decades, numerous approaches have been proposed to help practitioners predict or locate defective files. These techniques often use syntactic dependency, history co-change relations, or semantic similarity. However, it remains unclear whether these different dependency relations offer similar accuracy for defect prediction and localization. In this paper, we present a systematic investigation of this question from the perspective of software architecture. Considering the files involved in each dependency type as an individual design space, we model each such space as a DRSpace. We derived three DRSpaces for each of 117 Apache open source projects, comprising 643,079 revision commits and 101,364 bug reports in total, and calculated their interactions with defective files. The experimental results are surprising: the three dependency types present significantly different architectural views, and their interactions with defective files are also drastically different. Intuitively, they play completely different roles when used for defect prediction/localization. The good news is that combining these structures has the potential to improve the accuracy of defect prediction/localization. In summary, our work provides a new perspective on which type(s) of relations should be used for defect prediction/localization. These quantitative and qualitative results also advance our knowledge of the relationship between software quality and architectural views formed using different dependency types.