Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081861
Manishankar Mondal, C. Roy, Kevin A. Schneider
Code cloning has both positive and negative impacts on software maintenance and evolution. To address the issues related to code cloning, researchers suggest managing code clones through refactoring and tracking. However, it is impractical to refactor or track all clones in a software system. Thus, it is essential to identify which clones are important for refactoring and which are important for tracking. In this paper, we present a tool called SPCP-Miner, which is the first to automatically identify and rank the important refactoring and tracking candidates from the whole set of clones in a software system. SPCP-Miner implements the existing techniques that we used to conduct a large-scale empirical study on SPCP clones (i.e., clones that evolved following a Similarity Preserving Change Pattern, SPCP). We believe that SPCP-Miner can support better management of code clones by suggesting important clones for refactoring or tracking.
Title: SPCP-Miner: A tool for mining code clones that are important for refactoring or tracking. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081861
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081815
Shouzheng Yang, A. Manzer, Vassilios Tzerpos
Detecting design patterns in large software systems is a common reverse engineering task that can aid comprehension of the system's design. While several design pattern detection tools presented in the literature can detect design patterns automatically, their detection results are usually evaluated in a manual and subjective fashion. Differences in design pattern definitions, as well as in how pattern instances are counted and presented, exacerbate the difficulty of evaluating design pattern detection results. In this paper, we present a novel approach to evaluating and comparing design pattern detection results. Our approach, called MoRe, introduces a novel way to present design pattern instances in a uniform fashion. Based on this characterization of design pattern instances, we propose four measures for design pattern detection evaluation that convey a concise assessment of the quality of the results produced by a given detection method. We have implemented these measures and present case studies that showcase their usefulness.
Title: Measuring the quality of design pattern detection results. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081815
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081832
Marwan Abi-Antoun, Yibin Wang, E. Khalaj, Andrew Giang, V. Rajlich
During impact analysis on object-oriented code, statically extracting dependencies is often complicated by subclassing, programming to interfaces, aliasing, and collections, among others. When a tool recommends a large number of types or does not rank its recommendations, it may lead developers to explore more irrelevant code. We propose to mine and rank dependencies based on a global, hierarchical points-to graph that is extracted using abstract interpretation. A previous whole-program static analysis interprets a program enriched with annotations that express hierarchy, and over-approximates all the objects that may be created at runtime and how they may communicate. In this paper, an analysis mines the hierarchy and the edges in the graph to extract and rank dependencies such as the most important classes related to a class, or the most important classes behind an interface. An evaluation using two case studies on two systems totaling 10,000 lines of code and five completed code modification tasks shows that following dependencies based on abstract interpretation achieves higher effectiveness compared to following dependencies extracted from the abstract syntax tree. As a result, developers explore less irrelevant code.
Title: Impact analysis based on a global hierarchical Object Graph. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081832
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081859
Hiroaki Murakami, Yoshiki Higo, S. Kusumoto
Programmers often copy and paste code fragments when they would like to reuse them. Although copy-and-paste operations enable rapid development of software systems, they create code clones. Some clones have negative impacts on software development. For example, if we modify a code fragment, we have to check whether its clones need the same modification. In this case, programmers often use tools that take a code fragment as input and return its clones as output. However, with such existing tools, programmers have to open a number of source files and scroll up and down to browse the detected clones. To reduce the cost of browsing the detected clones, we developed ClonePacker, a tool that visualizes clones using Circle Packing. In an experiment with participants, we confirmed that participants using ClonePacker reported the locations of clones faster than those using an existing tool.
Title: ClonePacker: A tool for clone set visualization. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081859
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081864
Valerio Cosentino, Javier Luis Cánovas Izquierdo, Jordi Cabot
Software development projects face many risks (requirements inflation, poor scheduling, technical problems, etc.). Underestimating those risks may endanger the project's success. One of the most critical risks is employee turnover, that is, the risk of key personnel leaving the project. A good indicator for evaluating this risk is the concentration of information in individual developers. This is also popularly known as the bus factor (“number of key developers who would need to be incapacitated, i.e. hit by a bus, to make a project unable to proceed”). Despite the simplicity of the concept, calculating the actual bus factor for specific projects can quickly turn into an error-prone and time-consuming activity as the size of the project and development team increases. To help project managers assess the bus factor of their projects, in this paper we present a tool that, given a Git-based repository, automatically measures the bus factor for any file, directory and branch in the repository and for the project itself. The tool can also simulate what would happen to the project (e.g., which files would become orphans) if one or more developers disappeared.
Title: Assessing the bus factor of Git repositories. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081864
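To make the bus factor idea in this abstract concrete, the following is a minimal sketch and not the authors' tool or algorithm. It assumes a hypothetical per-file map of developer contribution counts (for example, commits or lines touched), and it assumes one simple definition: the bus factor of a file is the smallest number of top contributors who together account for more than a `threshold` share of contributions. The names `bus_factor`, `orphaned_files`, `threshold`, and the sample `repo` data are all illustrative.

```python
from collections import Counter


def bus_factor(contributions, threshold=0.5):
    """Smallest number of top contributors whose combined share exceeds threshold.

    `contributions` maps developer name -> contribution count (assumed metric).
    """
    total = sum(contributions.values())
    if total == 0:
        return 0
    covered = 0
    ranked = Counter(contributions).most_common()  # largest contributors first
    for count, (_, amount) in enumerate(ranked, start=1):
        covered += amount
        if covered / total > threshold:
            return count
    return len(contributions)


def orphaned_files(file_contributions, departed):
    """Files whose every known contributor is in `departed`."""
    gone = set(departed)
    return sorted(
        path
        for path, devs in file_contributions.items()
        if devs and set(devs) <= gone
    )


# Hypothetical contribution data for three files.
repo = {
    "core/parser.py": {"alice": 40, "bob": 5},
    "core/lexer.py": {"alice": 12},
    "docs/index.md": {"bob": 3, "carol": 3, "dave": 3},
}

print({path: bus_factor(devs) for path, devs in repo.items()})
# {'core/parser.py': 1, 'core/lexer.py': 1, 'docs/index.md': 2}
print(orphaned_files(repo, ["alice"]))
# ['core/lexer.py']
```

In this sketch, a file with a bus factor of 1 depends on a single developer, and the orphan simulation simply asks which files lose all of their contributors when a given set of developers leaves.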
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081816
L. Eshkevari, F. D. Santos, J. Cordy, G. Antoniol
PHP is by far the most popular web scripting language, accounting for more than 80% of existing websites. PHP is dynamically typed, which means that variables take on the type of the objects that they are assigned, and may change type as execution proceeds. While some type changes are likely not harmful, others involving function calls and global variables may be more difficult to understand and may be the source of many bugs. Hack, a new PHP variant endorsed by Facebook, attempts to address this problem by adding static typing to PHP variables, which limits them to a single consistent type throughout execution. This paper defines an empirical taxonomy of PHP type changes along three dimensions: the complexity or burden imposed to understand the type change; whether or not the change is potentially harmful; and the actual types changed. We apply static and dynamic analyses to three widely used web applications coded in PHP (WordPress, Drupal and phpBB) to investigate (1) to what extent developers really use dynamic typing; (2) what kinds of type changes are actually encountered; and (3) how difficult it might be to refactor the code to avoid type changes, and thus meet the constraints of Hack's static typing. We report evidence that dynamic typing is actually a relatively uncommon practice in production PHP programs, and that most dynamic type changes are simple representational changes, such as between strings and integers. We observe that most PHP type changes in these programs are relatively simple, and that the largest proportion of them are easy to refactor to consistent static typing using simple local renaming transformations. Overall, the paper casts doubt on the usefulness of dynamic typing in PHP, and indicates that for many production applications, conversion to Hack's static typing may not be very difficult.
Title: Are PHP applications ready for Hack? In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081816
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081876
Jiajun Hu, Xiaobing Sun, Bin Li
Software repositories such as revision control systems and bug tracking systems are usually used to manage the changes of software projects. During software maintenance and evolution, software developers and stakeholders need to investigate these repositories to identify what tasks were worked on in a particular time interval and how much effort was devoted to them. A typical way of mining software repositories is to use topic analysis models, e.g., Latent Dirichlet Allocation (LDA), to identify and organize the underlying structure in software documents to understand the evolution of development topics. Previous LDA-based topic analysis models can capture either changes in the strength (popularity) of various development topics over time (i.e., strength evolution) or changes in the content (the words that form the topic) of existing topics over time (i.e., content evolution). Unfortunately, few techniques can capture both strength and content evolution simultaneously. However, both pieces of information are necessary for developers to fully understand how software evolves. In this paper, we propose a novel approach to analyze commit messages within a project's lifetime to capture both strength and content evolution simultaneously via Online Latent Dirichlet Allocation (On-Line LDA). Moreover, the proposed approach also provides an efficient way to detect emerging topics in real development iterations when a new feature request arrives at a particular time, thus helping project stakeholders advance their projects smoothly.
Title: Explore the evolution of development topics via on-line LDA. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081876
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081881
Csaba Nagy, L. Meurice, Anthony Cleve
Concept location in software engineering is the process of identifying where a specific concept is implemented in the source code of a software system. It is a very common task performed by developers during development or maintenance, and many techniques have been studied by researchers to make it more efficient. However, most of the current techniques ignore the role of a database in the architecture of a system, which is also an important source of concepts or dependencies among them. In this paper, we present a concept location technique for data-intensive systems, that is, systems with at least one database server in their architecture that is intensively used by its clients. Specifically, we present a static technique for identifying the exact source code location from which a given SQL query was sent to the database. We evaluate our technique by collecting and locating SQL queries from testing scenarios of two open source Java systems under active development. With our technique, we are able to successfully identify the source of most of these queries.
Title: Where was this SQL query executed? a static concept location approach. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081881
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081824
Patanamon Thongtanunam, C. Tantithamthavorn, R. Kula, Norihiro Yoshida, Hajimu Iida, Ken-ichi Matsumoto
Software code review is an inspection of a code change by an independent third-party developer in order to identify and fix defects before integration. Effectively performing code review can improve the overall software quality. In recent years, Modern Code Review (MCR), a lightweight and tool-based code inspection, has been widely adopted in both proprietary and open-source software systems. Finding appropriate code-reviewers in MCR is a necessary step of reviewing a code change. However, little is known about the difficulty of finding code-reviewers in distributed software development and its impact on reviewing time. In this paper, we investigate the impact that reviews with a code-reviewer assignment problem have on reviewing time. We find that reviews with a code-reviewer assignment problem take 12 days longer to approve a code change. To help developers find appropriate code-reviewers, we propose RevFinder, a file location-based code-reviewer recommendation approach. We leverage the similarity of previously reviewed file paths to recommend an appropriate code-reviewer. The intuition is that files located in similar file paths would be managed and reviewed by similarly experienced code-reviewers. Through an empirical evaluation on a case study of 42,045 reviews of the Android Open Source Project (AOSP), OpenStack, Qt and LibreOffice projects, we find that RevFinder accurately recommended code-reviewers for 79% of reviews with a top-10 recommendation. RevFinder also correctly recommended the code-reviewers with a median rank of 4. The overall ranking of RevFinder is 3 times better than that of a baseline approach. We believe that RevFinder could be applied to MCR in order to help developers find appropriate code-reviewers and speed up the overall code review process.
Title: Who should review my code? A file location-based code-reviewer recommendation approach for Modern Code Review. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081824
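The intuition behind file location-based reviewer recommendation can be illustrated with a small sketch. This is not RevFinder itself: the abstract does not specify the similarity function, so the sketch assumes one simple choice, the length of the shared leading directory prefix normalized by the longer path. The names `path_similarity`, `recommend`, and the sample `history` are hypothetical.

```python
from collections import defaultdict


def path_similarity(a, b):
    """Shared leading path components, normalized by the longer path (assumed measure)."""
    parts_a, parts_b = a.split("/"), b.split("/")
    common = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        common += 1
    return common / max(len(parts_a), len(parts_b))


def recommend(new_files, past_reviews, top_k=3):
    """Rank reviewers of past reviews by how similar their reviewed paths are to new_files.

    `past_reviews` is a list of (file_paths, reviewers) pairs.
    """
    scores = defaultdict(float)
    for files, reviewers in past_reviews:
        sims = [path_similarity(n, p) for n in new_files for p in files]
        score = sum(sims) / len(sims) if sims else 0.0
        for reviewer in reviewers:
            scores[reviewer] += score
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [reviewer for reviewer, score in ranked if score > 0][:top_k]


# Hypothetical review history: (files in the review, reviewers who handled it).
history = [
    (["src/net/http/client.c", "src/net/http/server.c"], ["dana"]),
    (["src/ui/button.c"], ["eli"]),
    (["docs/readme.txt"], ["fay"]),
]

print(recommend(["src/net/http/cache.c"], history))
# ['dana', 'eli']
```

Here the reviewer who previously handled files under `src/net/http/` ranks first for a new file in the same directory, while a reviewer whose past files share no path prefix is not recommended at all.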
Pub Date: 2015-03-02; DOI: 10.1109/SANER.2015.7081893
Md Tajmilur Rahman
In my PhD research I will focus on modern release engineering practices. First, I have quantified the time and effort that is involved in stabilizing a release. I found that despite using rapid release, the Chrome and Linux projects still have a period where they rush changes into a release. Second, developers typically isolate unrelated changes on branches. However, developers at major companies, such as Google and Facebook, commit all changes to a single branch. They isolate unrelated changes using feature flags, which allow them to disable work in progress. My goal is to empirically determine the best practices when using flags and identify dead code. Finally, I will develop tool support to manage feature flags.
Title: Investigating modern release engineering practices. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER). https://doi.org/10.1109/SANER.2015.7081893
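As a rough illustration of how feature flags can isolate work in progress on a single branch, consider the sketch below. It is a generic example rather than the practice of any company named in the abstract; the flag name `new_checkout`, the `FLAGS` table, and both functions are hypothetical.

```python
# Hypothetical flag table; in practice this might be loaded from configuration.
FLAGS = {"new_checkout": False}


def is_enabled(name, flags=FLAGS):
    """Unknown flags default to off so unfinished code stays disabled."""
    return flags.get(name, False)


def checkout(cart_total, flags=FLAGS):
    """Both code paths live on the same branch; the flag selects one at runtime."""
    if is_enabled("new_checkout", flags):
        return f"new flow: {cart_total:.2f}"
    return f"legacy flow: {cart_total:.2f}"


print(checkout(10))                          # legacy flow: 10.00
print(checkout(10, {"new_checkout": True}))  # new flow: 10.00
```

Once a flag is permanently on, the branch it guarded against (here the legacy flow) becomes dead code, which is the kind of residue the abstract proposes to identify.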