Function Insight: Highlighting Suspicious Sections in Binary Run Traces
M. Cheatham, J. Raber. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.63

Function Insight is a tool for visualizing run traces at the functional level. A key feature is its ability to draw the user's attention to particularly important sections of code, selected by rule-based, machine-learning and data-mining heuristics or by a user-defined "interest metric."
A Preliminary Evaluation of Text-based and Dependency-based Techniques for Determining the Origins of Bugs
Steven Davies, M. Roper, M. Wood. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.32

A crucial step in understanding the life cycle of software bugs is identifying their origin. Unfortunately, this information is not usually recorded, and recovering it at a later date is challenging. Recently, two approaches have been developed that attempt to solve this problem: the text approach and the dependency approach. However, only limited evaluation of their effectiveness has been carried out so far, partly due to the lack of data sets linking bugs to the changes that introduced them. Producing such data sets is both time-consuming and challenging because of the subjective nature of the problem. To address this, the origins of 166 bugs in two open-source projects were manually identified and then compared to a simulation of the approaches. The results show that both approaches were partially successful across a variety of bug types, achieving a precision of 29% -- 79% and a recall of 40% -- 70%, and that they could perform better when combined. However, a number of challenges remain for future development -- large commits, unrelated changes and large numbers of versions between the origin and the fix all reduce their effectiveness.
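The text approach described above can be illustrated with a small sketch: the lines deleted by a bug-fixing change are taken as the buggy lines, and the origin is the earliest version that already contains them. This is a simplified, hypothetical illustration, not the paper's actual implementation; the version history and diff handling are deliberately minimal.

```python
# Simplified "text approach" to bug-origin identification: the lines
# removed by the fix are treated as the buggy lines, and the blamed
# origin is the earliest version in which all of them appear.

def deleted_lines(before, after):
    """Lines present before the fix but removed by it (the suspect lines)."""
    return [l for l in before if l not in after]

def find_origin(history, before_fix, after_fix):
    """history: list of (version_id, lines), ordered oldest to newest.
    Returns the earliest version containing every suspect line,
    i.e. the version blamed for introducing the bug."""
    suspects = deleted_lines(before_fix, after_fix)
    for version_id, lines in history:
        if all(s in lines for s in suspects):
            return version_id
    return None

history = [
    ("v1", ["int total = 0;"]),
    ("v2", ["int total = 0;", "total = a - b;"]),                    # bug introduced
    ("v3", ["int total = 0;", "total = a - b;", "print(total);"]),
]
before = history[-1][1]
after = ["int total = 0;", "total = a + b;", "print(total);"]        # the fix
print(find_origin(history, before, after))  # -> v2
```

Large commits and unrelated changes hurt exactly this step: they inflate the set of suspect lines, so the "earliest version containing all of them" drifts away from the true origin.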
Assessing Software Quality by Program Clustering and Defect Prediction
X. Tan, Xin Peng, Sen Pan, Wenyun Zhao. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.37

Many empirical studies have shown that defect prediction models built on product metrics can be used to assess the quality of software modules. So far, most methods in this direction predict defects by class or file. In this paper, we propose a novel software defect prediction method based on functional clusters of programs to improve the performance, especially the effort-aware performance, of defect prediction. The method uses properly grained, problem-oriented program clusters as the basic units of defect prediction. To evaluate its effectiveness, we conducted an experimental study on Eclipse 3.0. We found that, compared with class-based models, cluster-based prediction models can significantly improve the recall (from 31.6% to 99.2%) and precision (from 73.8% to 91.6%) of defect prediction. According to the effort-aware evaluation, the effort needed to review code to find half of the total defects can be reduced by 6% when cluster-based prediction models are used.
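One way to see why coarser prediction units can raise recall: a cluster counts as correctly flagged if the model flags any of its classes, so a defect in a missed class is still "caught" when a sibling class in the same functional cluster is flagged. The sketch below is a made-up numeric illustration of that granularity effect, not the paper's model or data.

```python
# Illustration of prediction granularity: lifting class-level predictions
# and labels to cluster level can raise recall. All names/data invented.

def recall_precision(flagged, defective):
    tp = len(flagged & defective)
    return tp / len(defective), tp / len(flagged)

# class-level ground truth and predictions
defective_classes = {"A", "C"}
flagged_classes = {"A", "B"}          # the model misses C and falsely flags B

# classes grouped into functional clusters
clusters = {"c1": {"A", "C"}, "c2": {"B", "D"}}

# a cluster is flagged (or defective) if any member class is
flagged_clusters = {c for c, ms in clusters.items() if ms & flagged_classes}
defective_clusters = {c for c, ms in clusters.items() if ms & defective_classes}

print(recall_precision(flagged_classes, defective_classes))    # (0.5, 0.5)
print(recall_precision(flagged_clusters, defective_clusters))  # (1.0, 0.5)
```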
Can We Predict Dependencies Using Domain information?
Amir Aryani, F. Perin, M. Lungu, A. Mahmood, Oscar Nierstrasz. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.17

Software dependencies play a vital role in program comprehension, change impact analysis and other software maintenance activities. Traditionally, these activities are supported by source code analysis; however, the source code is sometimes inaccessible, and not all stakeholders have adequate knowledge to perform such analysis. For example, non-technical domain experts and consultants raise most maintenance requests, yet they cannot predict the cost and impact of the requested changes without the support of the developers. We propose a novel approach to predicting software dependencies by exploiting the coupling present in domain-level information. Our approach is independent of the software implementation, so it can be used to evaluate architectural dependencies without access to the source code or the database. We evaluate the approach with a case study on a large-scale enterprise system, in which we demonstrate that up to 68% of the source code dependencies and 77% of the database dependencies can be predicted solely from domain information.
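The core idea — predicting code-level dependencies from domain-level coupling — can be sketched in a few lines. This is a hypothetical toy, not the paper's actual model: the domain concepts, the coupling relation and the concept-to-module mapping are all invented for illustration.

```python
# Toy sketch: if two domain concepts are coupled in the domain model,
# predict a dependency between the modules that implement them.
# All concept and module names are hypothetical.

domain_coupling = {("Order", "Invoice"), ("Order", "Customer")}
implemented_by = {"Order": "order_mod", "Invoice": "billing_mod",
                  "Customer": "crm_mod", "Report": "report_mod"}

def predicted_dependencies():
    """Lift each domain-level coupling to a predicted module dependency."""
    return {(implemented_by[a], implemented_by[b]) for a, b in domain_coupling}

print(predicted_dependencies())
```

Note that nothing here touches source code or the database; the prediction is driven purely by the domain model, which is the point of the approach.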
How Long Does a Bug Survive? An Empirical Study
G. Canfora, M. Ceccarelli, L. Cerulo, M. D. Penta. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.31

Corrective maintenance (bug fixing) can be performed long after a bug is introduced, or shortly afterwards. This time interval, the bug survival time, may depend on many factors: the bug's severity and harmfulness, but also how readily the bug manifests itself and how difficult it is to fix. This paper proposes the use of survival analysis to determine the relationship between the risk of not fixing a bug within a given time frame and the specific source code constructs -- e.g., expression operators or programming language constructs -- changed when fixing the bug. We estimate the survival time by extracting, from versioning repositories, the changes that introduce and fix bugs, and then correlate this time -- by means of survival models -- with the constructs changed during bug fixing. Results of a study performed on data extracted from the versioning repositories of four open source projects -- Eclipse, Mozilla, OpenLDAP, and Vuze -- indicate that long-lived bugs can be characterized by changes to specific code constructs.
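A standard ingredient of such survival analyses is the Kaplan-Meier estimator, which handles "censored" observations (bugs still open when the observation window closes). The minimal implementation below, with made-up survival times, only illustrates the general technique; the paper itself works with richer survival models over repository data.

```python
# Minimal Kaplan-Meier estimator for bug survival: S(t) is the
# probability that a bug remains unfixed past time t. Data is invented.

def kaplan_meier(times, fixed):
    """times: observed time in days (until the fix, or until observation
    ended); fixed: True if the bug was fixed (an event), False if it was
    still open (censored). Returns [(t, S(t))] at each event time."""
    s, curve = 1.0, []
    for t in sorted({ti for ti, f in zip(times, fixed) if f}):
        at_risk = sum(1 for ti in times if ti >= t)             # still unfixed at t
        events = sum(1 for ti, f in zip(times, fixed) if ti == t and f)
        s *= 1 - events / at_risk
        curve.append((t, s))
    return curve

times = [5, 5, 10, 20, 30]
fixed = [True, True, True, False, True]   # the 20-day bug is still open
print(kaplan_meier(times, fixed))         # [(5, 0.6), (10, 0.4), (30, 0.0)]
```

Comparing such curves for bugs whose fixes touch different code constructs is one way to surface the "long-lived bugs change specific constructs" pattern the study reports.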
jPET: An Automatic Test-Case Generator for Java
E. Albert, Israel Cabanas, Antonio Flores-Montoya, M. Gómez-Zamalloa, Sergio Gutierrez. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.67

We present jPET, a white-box test-case generator (TCG) that can be used during the development of Java applications within the Eclipse environment. jPET builds on top of PET, a TCG that automatically obtains test cases from the bytecode of a Java program. jPET reverse engineers the test cases obtained at the bytecode level by PET in order to present this information to the user at the source code level. This makes the information gathered at the lower level understandable and usable for testing Java source programs.
Measuring the Accuracy of Information Retrieval Based Bug Localization Techniques
Matthew D. Beard, Nicholas A. Kraft, L. Etzkorn, Stacy K. Lukins. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.23

Bug localization involves using information about a bug to locate the affected code sections. Several automated bug localization techniques based on information retrieval (IR) models have been constructed recently. The "gold standard" for measuring an IR technique's accuracy considers the technique's ability to locate a "first relevant method." However, the question remains: does finding this single method enable the location of the complete set of affected methods? Previous work assumes this to be true; however, few analyses of this assumption have been performed. In this paper, we perform a case study to test the reliability of this "gold standard" assumption. To further measure IR accuracy in the context of bug localization, we analyze the relevance of the IR model's "first method returned." We use various structural analysis techniques to extend the set of relevant methods located by IR techniques and to determine the accuracy and reliability of these assumptions.
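The IR models such techniques build on typically rank methods by textual similarity to the bug report; the "first relevant method" is simply the top-ranked hit that is actually affected. A minimal TF-IDF/cosine ranking sketch follows; the corpus, method names and bug report are invented, and real tools add identifier splitting, stemming and more sophisticated models.

```python
# Minimal TF-IDF + cosine-similarity ranking of methods against a bug
# report, the kind of IR model bug-localization techniques build on.
import math
from collections import Counter

def tfidf(doc_tokens, corpus):
    """Weight each term by frequency in the document times rarity in the corpus."""
    n = len(corpus)
    df = Counter(t for d in corpus for t in set(d))
    tf = Counter(doc_tokens)
    return {t: tf[t] * math.log(n / df[t]) for t in tf}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

methods = {  # method name -> tokens extracted from its source (invented)
    "Parser.parse":   "parse token stream syntax error".split(),
    "Cache.evict":    "evict least recently used entry".split(),
    "Parser.recover": "recover from syntax error resume parse".split(),
}
bug_report = "crash on syntax error while parsing".split()

corpus = list(methods.values()) + [bug_report]
query = tfidf(bug_report, corpus)
ranked = sorted(methods, key=lambda m: -cosine(tfidf(methods[m], corpus), query))
print(ranked)  # the Parser methods outrank the unrelated Cache.evict
```

The question the paper probes is what happens after this ranking: whether reaching the first relevant method really leads a developer to the rest of the affected methods.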
Reverse Engineering of Event Handlers of RAD-Based Applications
Ó. Ramón, J. Cuadrado, J. Molina. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.43

Businesses are increasingly modernising the legacy systems they developed with Rapid Application Development (RAD) environments, so that they can benefit from new platforms and technologies. When modernising applications developed with RAD environments, developers must deal with event handling code that typically mixes concerns such as GUI and business logic. In this paper we propose a model-based approach to the reverse engineering of event handlers in RAD-based applications. Event handling code is transformed into an intermediate representation that captures the high-level behaviour of the code. From this representation, some modernisation tasks can be automated, and we propose migration to a different architecture as an example: from PL/SQL code, a new Ajax application is generated in which business logic, user interface and control code have been separated.
Got Issues? Do New Features and Code Improvements Affect Defects?
Daryl Posnett, Abram Hindle, Premkumar T. Devanbu. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.33

There is a perception that when new features are added to a system, the added and modified parts of the source code are more fault-prone. Many have argued that new code and new features are defect-prone due to immaturity, lack of testing, and unstable requirements. Unfortunately, most previous work does not investigate the link between a concrete requirement or new feature and the defects it causes; in particular, the feature, the changed code and the subsequent defects are rarely investigated together. In this paper we investigate the relationship between improvements, new features and defects recorded within an issue tracker. A manual case study is performed to validate the accuracy of these issue types. We combine defect issues and new-feature issues with the version-control changes that introduce the features, and then explore the relationship between new features and the fault-proneness of their implementations. We describe properties and produce models of the relationship between new features and fault-proneness, based on the analysis of issue trackers and version-control systems. We find, surprisingly, that neither improvements nor new features have any significant effect on later defect counts when controlling for size and total number of changes.
Approximate Code Search in Program Histories
Saman Bazrafshan, R. Koschke, Nils Göde. 2011 18th Working Conference on Reverse Engineering, 17 October 2011. DOI: 10.1109/WCRE.2011.22

Very often, a defect must be corrected not only in the current version of a program at one particular place, but in many places and many other versions -- possibly even in different development branches. Consequently, we need a technique to efficiently locate all approximate matches of an arbitrary defective code fragment in the program's history, as they may need to be fixed as well. This paper presents an approximate whole-program code search covering multiple releases and branches. We evaluate this technique on real-world defects of various large, realistic programs with multiple releases and branches, and report runtime measurements and recall for varying levels of allowable differences in the approximate search.
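The "allowable differences" of such an approximate search can be pictured as an edit-distance budget. The sketch below slides a window over each (made-up) file version and keeps windows within a token-level Levenshtein budget of the defective fragment; it is a naive illustration, not the paper's index-based whole-program technique.

```python
# Naive approximate code search: report each file version containing a
# window whose token-level edit distance to the defective fragment is
# within max_diff. Fragment and files are invented for illustration.

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance over token lists."""
    prev = list(range(len(b) + 1))
    for i, ta in enumerate(a, 1):
        cur = [i]
        for j, tb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete from a
                           cur[j - 1] + 1,           # insert into a
                           prev[j - 1] + (ta != tb)))  # substitute
        prev = cur
    return prev[-1]

def approx_search(fragment, files, max_diff):
    frag, hits = fragment.split(), []
    w = len(frag)
    for name, text in files.items():
        toks = text.split()
        for i in range(max(len(toks) - w + 1, 1)):
            if edit_distance(frag, toks[i:i + w]) <= max_diff:
                hits.append((name, i))
                break  # one hit per file version suffices for this sketch
    return hits

files = {  # versions/branches of the program's history, tokenized
    "v1/util.c": "if ( p ) free ( p ) ;",
    "v2/util.c": "if ( ptr ) free ( ptr ) ;",   # same defect, renamed variable
    "v2/log.c":  "while ( q ) q = q -> next ;",
}
print(approx_search("if ( p ) free ( p ) ;", files, max_diff=2))
```

With a budget of 2, the renamed copy in v2 is still found; with a budget of 0, only the exact match in v1 survives, which is the precision/recall trade-off the paper measures at scale.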