We report on an exploratory study that aims at understanding how development tutorials are structured, what types of tutorials exist, and how official tutorials differ from tutorials written by development communities. We analyzed over 1,200 tutorials for mobile application development provided by six different sources for the three major platforms: Android, Apple iOS, and Windows Phone. We found that a typical tutorial contains around 2,700 words distributed over four pages and includes a list of instructions with 18 items. Overall, 70% of the tutorials contain source code examples, and a similar fraction contain images. On average, a tutorial has six images. When analyzing the images, we found that the studied iOS community posted the largest number of images, 14 per tutorial on average, of which 74% are plain images, i.e., mainly screenshots without stencils, diagrams, or highlights. In contrast, 36% of the images included in the official tutorials by Apple were diagrams or images with stencils. Community sites seem to follow a structure similar to the official sites but include items and images that are rather underrepresented in the official tutorials. From the analysis of the tutorials' content by means of natural language processing combined with manual content analysis, we derived four categories for mobile development tutorials: infrastructure and design, application and services, distribution and maintenance, and development platform. Our categorization can help tutorial writers better organize and evaluate the content of their tutorials and identify missing tutorials.
{"title":"How does a typical tutorial for mobile development look like?","authors":"Rebecca Tiarks, W. Maalej","doi":"10.1145/2597073.2597106","DOIUrl":"https://doi.org/10.1145/2597073.2597106","url":null,"abstract":"We report on an exploratory study, which aims at understanding how development tutorials are structured, what types of tutorials exist, and how official tutorials differ from tutorials written by development communities. We analyzed over 1.200 tutorials for mobile application development provided by six different sources for the three major platforms: Android, Apple iOS, and Windows Phone. We found that a typical tutorial contains around 2700 words distributed over 4 pages and including a list of instructions with 18 items. Overall, 70% of the tutorials contain source code examples and a similar fraction contain images. On average, one tutorial has 6 images. When analyzing the images, we found that the studied iOS community posted the largest number of images, 14 images per tutorial, on average, from which 74% are plain images, i.e., mainly screenshots without stencils, diagrams, or highlights. In contrast, 36% of the images included in the official tutorials by Apple were diagrams or images with stencils. Community sites seem to follow a similar structure to the official sites but include items and images which are rather underrepresented in the official tutorials. From the analysis of the tutorials content by means of natural language processing combined with manual content analysis, we derived four categories for mobile development tutorials: infrastructure and design, application and services, distribution and maintenance, and development platform. Our categorization can help tutorial writers to better organize and evaluate the content of their tutorials and identify missing tutorials.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"7 1","pages":"272-281"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74531301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The history of software systems tracked by version control systems is often incomplete because many file movements are not recorded. However, static code analyses that mine the file history, such as measuring change frequency or code churn, produce precise results only if the complete history of a source code file is available. In this paper, we show that up to 38.9% of the files in open source systems have an incomplete history, and we propose an incremental, commit-based approach to reconstruct the history based on clone information and name similarity. With this approach, the history of a file can be reconstructed across repository boundaries, providing accurate information for any source code analysis. We evaluate the approach in terms of correctness, completeness, performance, and relevance with a case study of seven open source systems and a developer survey.
{"title":"Incremental origin analysis of source code files","authors":"Daniela Steidl, B. Hummel, Elmar Jürgens","doi":"10.1145/2597073.2597111","DOIUrl":"https://doi.org/10.1145/2597073.2597111","url":null,"abstract":"The history of software systems tracked by version control systems is often incomplete because many file movements are not recorded. However, static code analyses that mine the file history, such as change frequency or code churn, produce precise results only if the complete history of a source code file is available. In this paper, we show that up to 38.9% of the files in open source systems have an incomplete history, and we propose an incremental, commit-based approach to reconstruct the history based on clone information and name similarity. With this approach, the history of a file can be reconstructed across repository boundaries and thus provides accurate information for any source code analysis. We evaluate the approach in terms of correctness, completeness, performance, and relevance with a case study among seven open source systems and a developer survey.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"09 1","pages":"42-51"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85056969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Issue reporting and resolution is a software engineering process supported by tools such as the Issue Tracking System (ITS), the Peer Code Review (PCR) system, and the Version Control System (VCS). Several open source software projects, such as Google Chromium and Android, follow a process in which a defect or feature enhancement request is reported to an issue tracker, followed by a source-code change or patch review and a patch commit using a version control system. We present an application of process mining to three software repositories (ITS, PCR, and VCS) from the control-flow and organizational perspectives for effective process management. Since ITS, PCR, and VCS are not explicitly linked, we implement regular-expression-based heuristics to integrate data from the three repositories for the Google Chromium project. We define activities such as bug reporting, bug fixing, bug verification, patch submission, patch review, and source code commit, and create an event log of the bug resolution process. The extracted event log contains audit trail data such as caseID, timestamp, activity name, and performer. We discover a runtime process model for the bug resolution process spanning the three repositories using the process mining tool Disco, and conduct process performance and efficiency analyses. We identify bottlenecks, and define and detect basic and composite anti-patterns. In addition to the control-flow analysis, we mine the event log to perform an organizational analysis and discover metrics such as handover of work, subcontracting, joint cases, and joint activities.
{"title":"Process mining multiple repositories for software defect resolution from control and organizational perspective","authors":"Monika Gupta, A. Sureka, S. Padmanabhuni","doi":"10.1145/2597073.2597081","DOIUrl":"https://doi.org/10.1145/2597073.2597081","url":null,"abstract":"Issue reporting and resolution is a software engineering process supported by tools such as Issue Tracking System (ITS), Peer Code Review (PCR) system and Version Control System (VCS). Several open source software projects such as Google Chromium and Android follow process in which a defect or feature enhancement request is reported to an issue tracker followed by source-code change or patch review and patch commit using a version control system. We present an application of process mining three software repositories (ITS, PCR and VCS) from control flow and organizational perspective for effective process management. ITS, PCR and VCS are not explicitly linked so we implement regular expression based heuristics to integrate data from three repositories for Google Chromium project. We define activities such as bug reporting, bug fixing, bug verification, patch submission, patch review, and source code commit and create an event log of the bug resolution process. The extracted event log contains audit trail data such as caseID, timestamp, activity name and performer. We discover runtime process model for bug resolution process spanning three repositories using process mining tool, Disco, and conduct process performance and efficiency analysis. We identify bottlenecks, define and detect basic and composite anti-patterns. In addition to control flow analysis, we mine event log to perform organizational analysis and discover metrics such as handover of work, subcontracting, joint cases and joint activities.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"81 1","pages":"122-131"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80877785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Over the past decade, several research efforts have studied the quality of software systems by looking at post-release bugs. However, these studies do not account for bugs that remain dormant (i.e., introduced in a version of the software system but not found until much later) for years and across many versions. Such dormant bugs skew our understanding of software quality. In this paper we study dormant bugs against non-dormant bugs using data from 20 different open-source Apache Foundation software systems. We find that 33% of the bugs introduced in a version are not reported until much later (i.e., they are reported in future versions as dormant bugs). Moreover, we find that 18.9% of the bugs reported in a version were not even introduced in that version (i.e., they are dormant bugs from prior versions). In short, using reported bugs to judge the quality of a specific version might be misleading. Exploring the fix process for dormant bugs, we find that they are fixed faster (median fix time of 5 days) than non-dormant bugs (median fix time of 8 days) and are fixed by more experienced developers (the median commit count of developers who fix dormant bugs is 169% higher). Our results highlight that dormant bugs differ from non-dormant bugs in many respects and that future research on software quality should carefully study and consider dormant bugs.
{"title":"An empirical study of dormant bugs","authors":"T. Chen, M. Nagappan, Emad Shihab, A. Hassan","doi":"10.1145/2597073.2597108","DOIUrl":"https://doi.org/10.1145/2597073.2597108","url":null,"abstract":"Over the past decade, several research efforts have studied the quality of software systems by looking at post-release bugs. However, these studies do not account for bugs that remain dormant (i.e., introduced in a version of the software system, but are not found until much later) for years and across many versions. Such dormant bugs skew our under- standing of the software quality. In this paper we study dormant bugs against non-dormant bugs using data from 20 different open-source Apache foundation software systems. We find that 33% of the bugs introduced in a version are not reported till much later (i.e., they are reported in future versions as dormant bugs). Moreover, we find that 18.9% of the reported bugs in a version are not even introduced in that version (i.e., they are dormant bugs from prior versions). In short, the use of reported bugs to judge the quality of a specific version might be misleading. Exploring the fix process for dormant bugs, we find that they are fixed faster (median fix time of 5 days) than non- dormant bugs (median fix time of 8 days), and are fixed by more experienced developers (median commit counts of developers who fix dormant bug is 169% higher). Our results highlight that dormant bugs are different from non-dormant bugs in many perspectives and that future research in software quality should carefully study and consider dormant bugs.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"38 1","pages":"82-91"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88376507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software repository data, for example in issue tracking systems, include natural language text and technical information, the latter ranging from log files and code snippets to stack traces. However, data mining is often interested in only one of the two types, e.g., natural language text in text mining. Regardless of which type is being investigated, any technique used has to deal with noise caused by fragments of the other type, i.e., methods interested in natural language have to deal with technical fragments and vice versa. This paper proposes an approach to classify unstructured data, e.g., development documents, into natural language text and technical information using a mixture of text heuristics and agglomerative hierarchical clustering. The approach was evaluated using 225 manually annotated text passages from developer emails and issue tracker data. Using whitespace tokenization as a basis, the overall precision of the approach is 0.84 and the recall is 0.85.
{"title":"Classifying unstructured data into natural language text and technical information","authors":"T. Merten, Bastian Mager, Simone Bürsner, B. Paech","doi":"10.1145/2597073.2597112","DOIUrl":"https://doi.org/10.1145/2597073.2597112","url":null,"abstract":"Software repository data, for example in issue tracking systems, include natural language text and technical information, which includes anything from log files via code snippets to stack traces. \u0000 However, data mining is often only interested in one of the two types e.g. in natural language text when looking at text mining. Regardless of which type is being investigated, any techniques used have to deal with noise caused by fragments of the other type i.e. methods interested in natural language have to deal with technical fragments and vice versa. \u0000 This paper proposes an approach to classify unstructured data, e.g. development documents, into natural language text and technical information using a mixture of text heuristics and agglomerative hierarchical clustering. \u0000 The approach was evaluated using 225 manually annotated text passages from developer emails and issue tracker data. Using white space tokenization as a basis, the overall precision of the approach is 0.84 and the recall is 0.85.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"41 1","pages":"300-303"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75569328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, studies of design and programming practices in mobile development have been gaining attention from researchers. Several such empirical studies used Android applications (paid, free, and open source) to analyze factors such as size, quality, dependencies, reuse, and cloning. Most of the studies use the executable files of the apps (APK files) instead of source code because of availability issues (most free apps available on the official Android market are not open source but can still be downloaded and analyzed in APK format). However, using only APK files in empirical studies comes with some threats to the validity of the results. In this paper, we analyze some of these pertinent threats. In particular, we analyzed the impact of third-party libraries and code obfuscation practices on estimating the amount of reuse by class cloning in Android apps. When including and excluding third-party libraries from the analysis, we found statistically significant differences in the amount of class cloning across 24,379 free Android apps. Also, we found some evidence that obfuscation is responsible for increasing the number of false positives when detecting class clones. Finally, based on our findings, we provide a list of actionable guidelines for mining and analyzing large repositories of Android applications while minimizing these threats to validity.
{"title":"Revisiting Android reuse studies in the context of code obfuscation and library usages","authors":"M. Vásquez, Andrew Holtzhauer, Carlos Bernal-Cárdenas, D. Poshyvanyk","doi":"10.1145/2597073.2597109","DOIUrl":"https://doi.org/10.1145/2597073.2597109","url":null,"abstract":"In the recent years, studies of design and programming practices in mobile development are gaining more attention from researchers. Several such empirical studies used Android applications (paid, free, and open source) to analyze factors such as size, quality, dependencies, reuse, and cloning. Most of the studies use executable files of the apps (APK files), instead of source code because of availability issues (most of free apps available at the Android official market are not open-source, but still can be downloaded and analyzed in APK format). However, using only APK files in empirical studies comes with some threats to the validity of the results. In this paper, we analyze some of these pertinent threats. In particular, we analyzed the impact of third-party libraries and code obfuscation practices on estimating the amount of reuse by class cloning in Android apps. When including and excluding third-party libraries from the analysis, we found statistically significant differences in the amount of class cloning 24,379 free Android apps. Also, we found some evidence that obfuscation is responsible for increasing a number of false positives when detecting class clones. Finally, based on our findings, we provide a list of actionable guidelines for mining and analyzing large repositories of Android applications and minimizing these threats to validity","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"265 1","pages":"242-251"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72766230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software regression testing is an integral part of most major software projects. As projects grow larger and the number of tests increases, performing regression testing becomes more costly. If software engineers can identify and run the tests that are more likely to detect failures during regression testing, they may be able to better manage their regression testing activities. In this paper, to help identify such test cases, we developed techniques that utilize various types of information in software repositories. To assess our techniques, we conducted an empirical study using an industrial software product, Microsoft Dynamics AX, which contains real faults. Our results show that the proposed techniques can be effective in identifying test cases that are likely to detect failures.
{"title":"Improving the effectiveness of test suite through mining historical data","authors":"Jeff Anderson, Saeed Salem, Hyunsook Do","doi":"10.1145/2597073.2597084","DOIUrl":"https://doi.org/10.1145/2597073.2597084","url":null,"abstract":"Software regression testing is an integral part of most major software projects. As projects grow larger and the number of tests increases, performing regression testing becomes more costly. If software engineers can identify and run tests that are more likely to detect failures during regression testing, they may be able to better manage their regression testing activities. In this paper, to help identify such test cases, we developed techniques that utilizes various types of information in software repositories. To assess our techniques, we conducted an empirical study using an industrial software product, Microsoft Dynamics AX, which contains real faults. Our results show that the proposed techniques can be effective in identifying test cases that are likely to detect failures.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"93 1","pages":"142-151"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84198609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software engineering is a maturing discipline that has seen many drastic advances in recent years. However, some studies still point to the lack of rigorous and mathematically grounded methods needed to raise the field to a new emerging science, with proper and reproducible foundations to build upon. Indeed, mathematicians and statisticians do not necessarily have software engineering knowledge, while software engineers and practitioners do not necessarily have a mathematical background. The Maisqual research project intends to fill the gap between both fields by proposing a controlled and peer-reviewed data set series ready to use and study. These data sets feature metrics from different repositories, from source code to mail activity and configuration management metadata. The metrics are described and commented on, and all the steps followed for their extraction and treatment are documented with contextual information about the data and its meaning. This article introduces the Apache Ant weekly data set, featuring 636 extracts of the project over 12 years at different levels of artefacts: application, files, and functions. By associating community- and process-related information with code extracts, this data set unveils interesting perspectives on the evolution of one of the great success stories of open source.
{"title":"Understanding software evolution: the maisqual ant data set","authors":"B. Baldassari, P. Preux","doi":"10.1145/2597073.2597136","DOIUrl":"https://doi.org/10.1145/2597073.2597136","url":null,"abstract":"Software engineering is a maturing discipline which has seen many drastic advances in the last years. However, some studies still point to the lack of rigorous and mathematically grounded methods to raise the field to a new emerging science, with proper and reproducible foundations to build upon. Indeed, mathematicians and statisticians do not necessarily have software engineering knowledge, while software engineers and practitioners do not necessarily have a mathematical background. \u0000 The Maisqual research project intends to fill the gap between both fields by proposing a controlled and peer-reviewed data set series ready to use and study. These data sets feature metrics from different repositories, from source code to mail activity and configuration management meta data. Metrics are described and commented, and all the steps followed for their extraction and treatment are described with contextual information about the data and its meaning. \u0000 This article introduces the Apache Ant weekly data set, featuring 636 extracts of the project over 12 years at different levels of artefacts – application, files, functions. By associating community and process related information to code extracts, this data set unveils interesting perspectives on the evolution of one of the great success stories of open source.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"41 1","pages":"424-427"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78821177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To predict files with defects, a suitable prediction model must be built for a software project from either itself (within-project) or other projects (cross-project). A universal defect prediction model built from the entire set of diverse projects would relieve the need to build models for individual projects. A universal model could also be interpreted as a basic relationship between software metrics and defects. However, variations in the distribution of predictors pose a formidable obstacle to building a universal model. Such variations exist among projects with different context factors (e.g., size and programming language). To overcome this challenge, we propose context-aware rank transformations for predictors. We cluster projects based on the similarity of the distributions of 26 predictors, and derive the rank transformations using quantiles of the predictors for a cluster. We then fit the universal model on the transformed data of 1,398 open source projects hosted on SourceForge and Google Code. Adding context factors to the universal model improves its predictive power. The universal model obtains prediction performance comparable to the within-project models and yields similar results when applied to five external projects (one Apache and four Eclipse projects). These results suggest that a universal defect prediction model may be an achievable goal.
{"title":"Towards building a universal defect prediction model","authors":"Feng Zhang, A. Mockus, I. Keivanloo, Ying Zou","doi":"10.1145/2597073.2597078","DOIUrl":"https://doi.org/10.1145/2597073.2597078","url":null,"abstract":"To predict files with defects, a suitable prediction model must be built for a software project from either itself (within-project) or other projects (cross-project). A universal defect prediction model that is built from the entire set of diverse projects would relieve the need for building models for an individual project. A universal model could also be interpreted as a basic relationship between software metrics and defects. However, the variations in the distribution of predictors pose a formidable obstacle to build a universal model. Such variations exist among projects with different context factors (e.g., size and programming language). To overcome this challenge, we propose context-aware rank transformations for predictors. We cluster projects based on the similarity of the distribution of 26 predictors, and derive the rank transformations using quantiles of predictors for a cluster. We then fit the universal model on the transformed data of 1,398 open source projects hosted on SourceForge and GoogleCode. Adding context factors to the universal model improves the predictive power. The universal model obtains prediction performance comparable to the within-project models and yields similar results when applied on five external projects (one Apache and four Eclipse projects). These results suggest that a universal defect prediction model may be an achievable goal.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"42 1","pages":"182-191"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89724424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Prior research suggests that Just-In-Time (JIT) defect prediction, i.e., predicting defect-inducing changes, is a more practical alternative to traditional defect prediction techniques, providing immediate feedback while design decisions are still fresh in the minds of developers. Unfortunately, similar to traditional defect prediction models, JIT models require a large amount of training data, which is not available when projects are in their initial development phases. To address this flaw in traditional defect prediction, prior work has proposed cross-project models, i.e., models learned from older projects with sufficient history. However, cross-project models have not yet been explored in the context of JIT prediction. Therefore, in this study, we empirically evaluate the performance of JIT cross-project models. Through a case study on 11 open source projects, we find that in a JIT cross-project context: (1) high-performance within-project models rarely perform well; (2) models trained on projects that have similar correlations between predictor and dependent variables often perform well; and (3) ensemble learning techniques that leverage historical data from several other projects (e.g., voting experts) often perform well. Our findings empirically confirm that JIT cross-project models learned using other projects are a viable solution for projects with little historical data. However, JIT cross-project models perform best when the data used to learn them is carefully selected.
{"title":"An empirical study of just-in-time defect prediction using cross-project models","authors":"Takafumi Fukushima, Yasutaka Kamei, Shane McIntosh, Kazuhiro Yamashita, Naoyasu Ubayashi","doi":"10.1145/2597073.2597075","DOIUrl":"https://doi.org/10.1145/2597073.2597075","url":null,"abstract":"Prior research suggests that predicting defect-inducing changes, i.e., Just-In-Time (JIT) defect prediction is a more practical alternative to traditional defect prediction techniques, providing immediate feedback while design decisions are still fresh in the minds of developers. Unfortunately, similar to traditional defect prediction models, JIT models require a large amount of training data, which is not available when projects are in initial development phases. To address this flaw in traditional defect prediction, prior work has proposed cross-project models, i.e., models learned from older projects with sufficient history. However, cross-project models have not yet been explored in the context of JIT prediction. Therefore, in this study, we empirically evaluate the performance of JIT cross-project models. Through a case study on 11 open source projects, we find that in a JIT cross-project context: (1) high performance within-project models rarely perform well; (2) models trained on projects that have similar correlations between predictor and dependent variables often perform well; and (3) ensemble learning techniques that leverage historical data from several other projects (e.g., voting experts) often perform well. Our findings empirically confirm that JIT cross-project models learned using other projects are a viable solution for projects with little historical data. However, JIT cross-project models perform best when the data used to learn them is carefully selected.","PeriodicalId":6621,"journal":{"name":"2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)","volume":"2068 1","pages":"172-181"},"PeriodicalIF":0.0,"publicationDate":"2014-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86549900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}