Towards a distributed execution framework for JUnit test cases
T. Parveen, S. Tilley, Nigel Daley, Pedro Morales
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306292
JUnit is the de facto framework for creating and executing unit tests. This paper introduces HadoopUnit, a distributed execution framework for JUnit test cases built upon Hadoop, an open-source platform for running applications that process vast amounts of data on large clusters of commodity hardware. The primary motivation behind developing HadoopUnit was to test Hadoop production code using the Hadoop platform itself: existing approaches to testing the system took too long to run to completion and could not provide timely feedback to developers. Preliminary results suggest that HadoopUnit can reduce test execution time significantly: a 150-node cluster produced a 30x improvement. HadoopUnit can be used by anyone to test an application, as long as they have access to a computing cluster with the Hadoop software installed and running.
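The core idea of the abstract above is to treat each JUnit test case as an independent map task and aggregate the results in a reduce step. A minimal sketch of that division of work, with hypothetical names (this is not HadoopUnit's actual API, and the stand-in test runner is invented for illustration):

```python
# Toy sketch of distributing test execution MapReduce-style: each map task
# runs one test and emits (test_name, result); the reduce step aggregates
# the per-test results into a summary. In HadoopUnit, each map task would
# run on a different cluster node.

def run_test(test_name):
    # Stand-in for invoking a JUnit test; here a test "passes" unless flagged.
    failing = {"testParserCrash"}
    return "PASS" if test_name not in failing else "FAIL"

def map_phase(test_names):
    # One (test, result) pair per test case.
    return [(name, run_test(name)) for name in test_names]

def reduce_phase(pairs):
    # Aggregate individual results into a pass/fail summary.
    summary = {"PASS": 0, "FAIL": 0}
    for _, result in pairs:
        summary[result] += 1
    return summary

tests = ["testSort", "testMerge", "testParserCrash", "testSplit"]
summary = reduce_phase(map_phase(tests))
```

Because the map tasks share no state, they can run on as many nodes as the cluster provides, which is where the reported 30x speedup would come from.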
Scalable and incremental clone detection for evolving software
T. Nguyen, H. Nguyen, Jafar M. Al-Kofahi, N. Pham, T. Nguyen
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306283
Code clone management has been shown to have several benefits for software developers. When source code evolves, clone management requires a mechanism to efficiently and incrementally detect code clones in the new revision. This paper introduces an incremental clone detection tool, called ClemanX. Our tool represents code fragments as subtrees of Abstract Syntax Trees (ASTs), measures their similarity levels based on their characteristic vectors of structural features, and solves the task of incrementally detecting similar code as an incremental distance-based clustering problem. Our empirical evaluation on large-scale software projects shows the usefulness and good performance of ClemanX.
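The approach described above can be illustrated with a small sketch: represent a fragment by a counting vector of its AST node types, and assign each new fragment to the first cluster whose representative is within a distance threshold. The feature encoding, distance metric, and threshold below are assumptions for illustration, not ClemanX's actual definitions:

```python
from collections import Counter

def characteristic_vector(ast_node_types):
    # Count occurrences of each AST node type in a fragment -- a crude
    # stand-in for the structural-feature vectors the paper describes.
    return Counter(ast_node_types)

def l1_distance(v1, v2):
    keys = set(v1) | set(v2)
    return sum(abs(v1.get(k, 0) - v2.get(k, 0)) for k in keys)

def assign_incrementally(clusters, fragment, threshold=2):
    # Distance-based incremental clustering: join the first cluster whose
    # representative vector is within the threshold, else start a new cluster.
    vec = characteristic_vector(fragment)
    for rep, members in clusters:
        if l1_distance(rep, vec) <= threshold:
            members.append(fragment)
            return clusters
    clusters.append((vec, [fragment]))
    return clusters

clusters = []
for frag in (["if", "call", "return"],
             ["if", "call", "return", "assign"],   # near-clone of the first
             ["loop", "assign"]):                  # structurally different
    assign_incrementally(clusters, frag)
```

The incremental aspect is that only the new revision's fragments are compared against existing cluster representatives, rather than re-clustering the whole system.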
Decomposing object-oriented class modules using an agglomerative clustering technique
Marios Fokaefs, Nikolaos Tsantalis, A. Chatzigeorgiou, J. Sander
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306332
Software can be considered a live entity, as it undergoes many alterations throughout its lifecycle. Moreover, developers often sacrifice good design in favor of adding new features, complying with requirements, or meeting deadlines. For these reasons, code can become complex and difficult to understand. In object-oriented systems in particular, classes may become very large and less cohesive. To identify such problematic cases, existing approaches have proposed the use of cohesion metrics. However, while metrics can identify classes with low cohesion, they cannot identify new or independent concepts, and they require considerable human interpretation to identify the respective design flaws. In this paper, we propose a class decomposition method using an agglomerative clustering algorithm based on the Jaccard distance between class members. Our methodology is able to identify new concepts and rank the solutions according to their impact on the design quality of the system. Finally, our method has been evaluated by two independent designers, who were asked to comment on the suggestions produced by our technique on their projects. The designers provided feedback on the ability of the method to identify new concepts and improve the design quality of the system in terms of cohesion.
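The clustering step the abstract describes can be sketched as follows: model each class member by the set of entities (fields, other members) it uses, compute Jaccard distances between those sets, and greedily merge the closest clusters. The linkage rule and threshold here are simplifications assumed for illustration, not the paper's exact algorithm:

```python
def jaccard_distance(a, b):
    # Jaccard distance between the entity sets two class members use:
    # 1 - |intersection| / |union|.
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def agglomerate(members, max_distance=0.5):
    # Naive agglomerative clustering: repeatedly merge the closest pair of
    # clusters while their distance stays under max_distance, representing a
    # merged cluster by the union of its entity sets.
    clusters = [(name, set(ents)) for name, ents in members.items()]
    while len(clusters) > 1:
        i, j, d = min(
            ((i, j, jaccard_distance(clusters[i][1], clusters[j][1]))
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda t: t[2])
        if d > max_distance:
            break
        name_i, set_i = clusters[i]
        name_j, set_j = clusters.pop(j)
        clusters[i] = (name_i + "+" + name_j, set_i | set_j)
    return clusters

# Hypothetical class members: two touch account state, one touches UI state,
# so the class plausibly hides two separate concepts.
members = {
    "getBalance": {"balance"},
    "deposit":    {"balance", "log"},
    "renderIcon": {"icon", "theme"},
}
result = agglomerate(members)
```

Two surviving clusters would suggest extracting the rendering concern into its own class, which is the kind of decomposition suggestion the method ranks.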
Guide: A GUI differentiator
Qing Xie, M. Grechanik, Chen Fu, Chad M. Cumby
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306286
Applications with Graphical User Interfaces (GUIs) are ubiquitous. Nontrivial GUI-based applications (GAPs) evolve frequently, and understanding how GUIs of different versions of GAPs differ is crucial for various tasks such as testing and project effort estimation. We offer a novel approach for comparing GUIs. We built a tool called GUI DifferEntiator (GUIDE) that allows users to automatically visualize differences between GUIs of running GAPs, so that the identified changes can serve as guidance for testing the new release of the GAP or estimating the effort of a project.
Preprocessing the noise in legacy user permission assignment data for role mining — An industrial practice
Chao Huang, Jianling Sun, Xinyu Wang, Yuanjie Si, Di Wu
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306288
Little is known about how to preprocess the noise in legacy access control data when migrating to the Role-based Access Control model. This paper presents an experience report on the noise preprocessing method applied to the legacy access control data of several financial systems. The main goal of this project is to improve the quality of the roles obtained via role mining.
Effective and efficient localization of multiple faults using value replacement
D. Jeffrey, Neelam Gupta, Rajiv Gupta
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306303
We previously presented a fault localization technique called Value Replacement that repeatedly alters the state of an executing program to locate a faulty statement [9]. The technique searches for program statements involving values that can be altered during runtime to cause the incorrect output of a failing run to become correct. We showed that highly effective fault localization results could be achieved by the technique on programs containing single faults. In the current work, we generalize Value Replacement so that it can also perform effectively in the presence of multiple faults. We improve scalability by describing two techniques that significantly improve the efficiency of Value Replacement. In our experimental study, our generalized technique effectively isolates multiple simultaneous faults in time on the order of minutes in each case, whereas in [9], the technique had sometimes required time on the order of hours to isolate only single faults.
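The core search the abstract describes — trying alternate values at a statement and checking whether a failing run's output becomes correct — can be sketched on a toy program. Everything below (the program, the candidate-value sets, the override mechanism) is invented for illustration and is not the authors' implementation:

```python
# Minimal sketch of the value-replacement idea: re-run a tiny "program" with
# alternate values at a statement and flag the statement as suspicious if
# some replacement turns the failing output into the expected one.

def program(x, s1_override=None):
    # Faulty program: statement s1 should compute x + 1 but computes x + 2.
    s1 = (x + 2) if s1_override is None else s1_override
    s2 = s1 * 3
    return s2

def locate_fault(x, expected, candidate_values):
    # For each statement, try each candidate value; a statement is suspicious
    # if some replacement makes the failing run produce the expected output.
    suspicious = []
    for stmt, values in candidate_values.items():
        for v in values:
            if stmt == "s1" and program(x, s1_override=v) == expected:
                suspicious.append(stmt)
                break
    return suspicious

# Failing run: for input 1 we expect (1 + 1) * 3 == 6, but program(1) == 9.
suspicious = locate_fault(1, expected=6, candidate_values={"s1": [1, 2, 3]})
```

Replacing s1's value with 2 yields the expected output 6, so s1 is flagged — which is where the real fault sits. The efficiency work in the paper is about taming the combinatorial cost of exactly this kind of search.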
Analyzing the evolution of user-visible features: A case study with Eclipse
Daqing Hou, Yuejiao Wang
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306276
Integrated Development Environments (IDEs) help increase programmer productivity by automating much clerical and administrative work. Thus, it is of great research and practical interest to learn about the characteristics of how IDE features change and mature. To this end, we have conducted an empirical study, analyzing a total of 645 "What's New" release note entries across 7 releases of the Eclipse IDE, both quantitatively and qualitatively. We found that the majority of the changes are refinements of, or incremental additions to, the feature architecture set up in the early releases (1.0 and 2.0). Motivated by this, a further analysis of usability is performed to characterize how these changes affect programmers' effectiveness in using the IDE. We summarize our study methodology and lessons learned.
Evolution analysis with animated and 3D-visualizations
Sven Wenzel, J. Koch, U. Kelter, A. Kolb
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306279
Large software systems typically exist in many revisions. In order to analyze such systems, their evolution needs to be analyzed, too. The main challenge in this context is to cope with the large volume of data and to visualize the system and its evolution as a whole. This paper presents an approach for visualizing the evolution of large-scale systems which uses three different, tightly integrated visualizations that are 3-dimensional and/or animated and which support different analysis tasks. According to a first empirical study, all tasks are supported well by at least one visualization.
Software maintainability benefits from annotation-driven code
L. Nigul, Ernest Mah
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306295
Java annotations and their predecessor, XDoclet annotations, are well positioned to offer a substantial benefit for developing and maintaining software by combining relevant metadata with the code that makes use of it. In this paper we share the insights that we gained by developing the J2EE Connector Tools suite, which is part of IBM Rational Application Developer. This tool suite is used to generate annotated Java code capable of connecting to Enterprise Information Systems through the Java EE Connector APIs. The generated code is annotation-driven, and in the paper we show how the annotations in the generated code helped us achieve highly efficient code maintenance, both for the tools that generate the code and for the generated code itself.
An empirical study on the risks of using off-the-shelf techniques for processing mailing list data
Nicolas Bettenburg, Emad Shihab, A. Hassan
2009 IEEE International Conference on Software Maintenance. Pub Date: 2009-10-30. DOI: 10.1109/ICSM.2009.5306383
Mailing list repositories contain valuable information about the history of a project. Research is starting to mine this information to support developers and maintainers of long-lived software projects. However, such information exists as unstructured data that needs special processing before it can be studied. In this paper, we identify several challenges that arise when using off-the-shelf techniques for processing mailing list data. Our study highlights the importance of proper processing of mailing list data to ensure accurate research results.
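One concrete pitfall of the kind the abstract above warns about: naive mining counts quoted reply text as new content, inflating results. The message below is fabricated for illustration; the parsing itself uses Python's standard-library email module:

```python
import email
from email import policy

# Parse a raw mailing list message and drop ">"-quoted reply lines before
# analysis, so only the author's new content is mined.
raw = """\
From: dev@example.org
Subject: Re: build failure

> Did the nightly build fail again?
Yes, the linker step broke after the refactoring.
"""

msg = email.message_from_string(raw, policy=policy.default)
body_lines = msg.get_content().splitlines()
new_content = [ln for ln in body_lines if not ln.lstrip().startswith(">")]
```

Without the quote-stripping step, a word-frequency or authorship analysis would attribute the quoted question to the replying author — exactly the kind of off-the-shelf-processing error that can distort research results.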