Dominik Franke, Corinna Elsemann, S. Kowalewski, Carsten Weise
In mobile applications, the application lifecycle consists of the process-related states (e.g. suspended, ready, running) and the transitions between them. A faulty or insufficient implementation of the mobile application lifecycle can be the source of many problematic faults, e.g. loss of data. Thus for a software developer, understanding and mastering the mobile application lifecycle is essential for high quality software. In our work with various mobile platforms, we found that the given lifecycle models and corresponding documentation are often inconsistent, incomplete and incorrect. In this paper we present a way to reverse-engineer application lifecycles of mobile platforms by testing. Within a case study we apply the presented concept to three mobile platforms: Android, iOS and Java ME. We further show how developers of mobile applications can use our results to get correct lifecycle models for these platforms.
{"title":"Reverse Engineering of Mobile Application Lifecycles","authors":"Dominik Franke, Corinna Elsemann, S. Kowalewski, Carsten Weise","doi":"10.1109/WCRE.2011.42","DOIUrl":"https://doi.org/10.1109/WCRE.2011.42","url":null,"abstract":"In mobile applications, the application lifecycle consists of the process-related states (e.g. suspended, ready, running) and the transitions between them. A faulty or insufficient implementation of the mobile application lifecycle can be the source of many problematic faults, e.g. loss of data. Thus for a software developer, understanding and mastering the mobile application lifecycle is essential for high quality software. In our work with various mobile platforms, we found that the given lifecycle models and corresponding documentation are often inconsistent, incomplete and incorrect. In this paper we present a way to reverse-engineer application lifecycles of mobile platforms by testing. Within a case study we apply the presented concept to three mobile platforms: Android, iOS and Java ME. We further show how developers of mobile applications can use our results to get correct lifecycle models for these platforms.","PeriodicalId":350863,"journal":{"name":"2011 18th Working Conference on Reverse Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130096031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Migrating legacy systems towards Service-Oriented Architectures requires the identification of legacy code that is able to implement the new services. This paper proposes an approach combining dynamic analysis and data mining techniques to map legacy code to business processes and to identify code for service implementations based on this mapping. Validating the clustering solution in a first case study resulted in values of 70,6% in precision and 83,5% in recall.
{"title":"Using Dynamic Analysis and Clustering for Implementing Services by Reusing Legacy Code","authors":"Andreas Fuhr, Tassilo Horn, Volker Riediger","doi":"10.1109/WCRE.2011.41","DOIUrl":"https://doi.org/10.1109/WCRE.2011.41","url":null,"abstract":"Migrating legacy systems towards Service-Oriented Architectures requires the identification of legacy code that is able to implement the new services. This paper proposes an approach combining dynamic analysis and data mining techniques to map legacy code to business processes and to identify code for service implementations based on this mapping. Validating the clustering solution in a first case study resulted in values of 70,6% in precision and 83,5% in recall.","PeriodicalId":350863,"journal":{"name":"2011 18th Working Conference on Reverse Engineering","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131900342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software security has made great progress, code analysis tools are widely-used in industry for detecting common implementation-level security bugs. However, given the fact that we must deal with legacy code we plead to employ the techniques long been developed in the research area of program comprehension for software security. In cooperation with a security expert, we carried out a case study with the mobile phone platform Android, and employed the reverse engineering tool-suite Bauhaus for this security assessment. During the investigation we found some inconsistencies in the implementation of the Android security concepts. Based on the lessons learned from the case study, we propose several research topics in the area of reverse engineering that would support a security analyst during security assessments.
{"title":"An Android Security Case Study with Bauhaus","authors":"Bernhard J. Berger, Michaela Bunke, K. Sohr","doi":"10.1109/WCRE.2011.29","DOIUrl":"https://doi.org/10.1109/WCRE.2011.29","url":null,"abstract":"Software security has made great progress, code analysis tools are widely-used in industry for detecting common implementation-level security bugs. However, given the fact that we must deal with legacy code we plead to employ the techniques long been developed in the research area of program comprehension for software security. In cooperation with a security expert, we carried out a case study with the mobile phone platform Android, and employed the reverse engineering tool-suite Bauhaus for this security assessment. During the investigation we found some inconsistencies in the implementation of the Android security concepts. Based on the lessons learned from the case study, we propose several research topics in the area of reverse engineering that would support a security analyst during security assessments.","PeriodicalId":350863,"journal":{"name":"2011 18th Working Conference on Reverse Engineering","volume":"191 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131794748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present the results of a controlled experiment and a differentiated replication that have been carried out to assess the effect of the documentation of design patterns on the comprehension of source code. The two experiments involved Master Students in Computer Science at the University of Basilicas a and at University of Salerno, respectively. The participants to the original experiment performed a comprehension task with and without graphically-documented design patterns. Textually documented design patterns were provided or not to the participants to perform a comprehension task within the replication. The data analysis revealed that participants employing graphically documented design patterns achieved significantly better performances than the participants provided with source code alone. Conversely, the effect of textually documented design patterns was not statistically significant. A further analysis revealed that the documentation type (textual and graphical) does not significantly affect the performance, when participants correctly recognize design pattern instances.
{"title":"Does the Documentation of Design Pattern Instances Impact on Source Code Comprehension? Results from Two Controlled Experiments","authors":"C. Gravino, M. Risi, G. Scanniello, G. Tortora","doi":"10.1109/WCRE.2011.18","DOIUrl":"https://doi.org/10.1109/WCRE.2011.18","url":null,"abstract":"We present the results of a controlled experiment and a differentiated replication that have been carried out to assess the effect of the documentation of design patterns on the comprehension of source code. The two experiments involved Master Students in Computer Science at the University of Basilicas a and at University of Salerno, respectively. The participants to the original experiment performed a comprehension task with and without graphically-documented design patterns. Textually documented design patterns were provided or not to the participants to perform a comprehension task within the replication. The data analysis revealed that participants employing graphically documented design patterns achieved significantly better performances than the participants provided with source code alone. Conversely, the effect of textually documented design patterns was not statistically significant. A further analysis revealed that the documentation type (textual and graphical) does not significantly affect the performance, when participants correctly recognize design pattern instances.","PeriodicalId":350863,"journal":{"name":"2011 18th Working Conference on Reverse Engineering","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126306613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bug localization involves the use of information about a bug to assist in locating sections of code that must be modified to fix the bug. Such a task can involve a considerable amount of time and effort on the part of software developers and/or maintainers. Recently, several automated bug localization techniques based on information retrieval (IR) models have been developed to speed the process of bug localization. Another code analysis technique involves locating duplicated sections of code in software projects, called code clones. We examine the application of code clone location techniques in the context of bug localization. We attempt to determine the advantages of extending existing code clone location techniques through the inclusion of IR models in the analysis process. We also examine a technique for extending the use of bug logging repositories and version control systems by analyzing the two using IR techniques.
{"title":"Extending Bug Localization Using Information Retrieval and Code Clone Location Techniques","authors":"Matthew D. Beard","doi":"10.1109/WCRE.2011.61","DOIUrl":"https://doi.org/10.1109/WCRE.2011.61","url":null,"abstract":"Bug localization involves the use of information about a bug to assist in locating sections of code that must be modified to fix the bug. Such a task can involve a considerable amount of time and effort on the part of software developers and/or maintainers. Recently, several automated bug localization techniques based on information retrieval (IR) models have been developed to speed the process of bug localization. Another code analysis technique involves locating duplicated sections of code in software projects, called code clones. We examine the application of code clone location techniques in the context of bug localization. We attempt to determine the advantages of extending existing code clone location techniques through the inclusion of IR models in the analysis process. We also examine a technique for extending the use of bug logging repositories and version control systems by analyzing the two using IR techniques.","PeriodicalId":350863,"journal":{"name":"2011 18th Working Conference on Reverse Engineering","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128840038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software engineers are coerced to deal with a large amount of information about source code. Appropriate tools could assist to handle it, but existing tools are not capable of processing and presenting such a large amount of information sufficiently. With the advent of in-memory column-oriented databases the performance of some data-intensive applications could be significantly improved. This has resulted in a completely new user experience of those applications and enabled new use-cases. This PhD thesis investigates the applicability of in-memory column-oriented databases for supporting daily software engineering activities. The major research question addressed in this thesis is as follows: does in-memory column-oriented database technology provide the necessary performance advantages for working interactively with large amounts of fine-grained structural information about source code? To investigate this research question two scenarios have been selected that particularly suffer from low performance. The first selected scenario is source code search. Existing source code repositories contain a large amount of structural data. Interface definitions, abstract syntax trees, and call graphs are examples of such structural data. Existing tools have solved the performance problems either by reducing the amount of data because of using a coarse-grained representation, or by preparing answers to developers' questions in advance, or by reducing the scope of search. All currently existing alternatives result in the loss of developers' productivity. The second scenario is source code analytics. To complete reverse engineering tasks software engineers often are required to analyze a number of atomic facts that have been extracted from source code. Examples of such atomic facts are occurrences of certain syntactic patterns in code, software product metrics or violations of development guidelines. Each fact typically has several characteristics, such as the type of the fact, the location in code where found, and some attributes. Particularly, analysis of large software systems requires the ability to process a large amount of such facts efficiently. During industrial experiments conducted for this thesis it was evidenced that in-memory technology provides performance gains that improve developers' productivity and enable scenarios previously not possible. This thesis overlaps both software engineering and database technology. From the viewpoint of software engineering, it seeks to find a way to support developers in dealing with a large amount of structural data. From the viewpoint of database technology, source code search and analytics are domains for studying fundamental issues of storing and querying structural data.
{"title":"In-Memory Database Support for Source Code Search and Analytics","authors":"O. Panchenko","doi":"10.1109/WCRE.2011.60","DOIUrl":"https://doi.org/10.1109/WCRE.2011.60","url":null,"abstract":"Software engineers are coerced to deal with a large amount of information about source code. Appropriate tools could assist to handle it, but existing tools are not capable of processing and presenting such a large amount of information sufficiently. With the advent of in-memory column-oriented databases the performance of some data-intensive applications could be significantly improved. This has resulted in a completely new user experience of those applications and enabled new use-cases. This PhD thesis investigates the applicability of in-memory column-oriented databases for supporting daily software engineering activities. The major research question addressed in this thesis is as follows: does in-memory column-oriented database technology provide the necessary performance advantages for working interactively with large amounts of fine-grained structural information about source code? To investigate this research question two scenarios have been selected that particularly suffer from low performance. The first selected scenario is source code search. Existing source code repositories contain a large amount of structural data. Interface definitions, abstract syntax trees, and call graphs are examples of such structural data. Existing tools have solved the performance problems either by reducing the amount of data because of using a coarse-grained representation, or by preparing answers to developers' questions in advance, or by reducing the scope of search. All currently existing alternatives result in the loss of developers' productivity. The second scenario is source code analytics. To complete reverse engineering tasks software engineers often are required to analyze a number of atomic facts that have been extracted from source code. Examples of such atomic facts are occurrences of certain syntactic patterns in code, software product metrics or violations of development guidelines. Each fact typically has several characteristics, such as the type of the fact, the location in code where found, and some attributes. Particularly, analysis of large software systems requires the ability to process a large amount of such facts efficiently. During industrial experiments conducted for this thesis it was evidenced that in-memory technology provides performance gains that improve developers' productivity and enable scenarios previously not possible. This thesis overlaps both software engineering and database technology. From the viewpoint of software engineering, it seeks to find a way to support developers in dealing with a large amount of structural data. From the viewpoint of database technology, source code search and analytics are domains for studying fundamental issues of storing and querying structural data.","PeriodicalId":350863,"journal":{"name":"2011 18th Working Conference on Reverse Engineering","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126931412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Source code contains textual, structural, and semantic information, which can all be leveraged for effective search. Some studies have proposed semantic code search where users can specify query topics in a natural language. Other studies can search through system dependence graphs. In this paper, we propose a semantic dependence search engine that integrates both kinds of techniques and can retrieve code snippets based on expressive user queries describing both topics and dependencies. Users can specify their search targets in a free form format describing desired topics (i.e., high-level semantic or functionality of the target code); a specialized graph query language allows users to describe low-level data and control dependencies in code and thus helps to refine the queries described in the free format. Our empirical evaluation on a number of software maintenance tasks shows that our search engine can efficiently locate desired code fragments accurately.
{"title":"Code Search via Topic-Enriched Dependence Graph Matching","authors":"Shaowei Wang, D. Lo, Lingxiao Jiang","doi":"10.1109/WCRE.2011.69","DOIUrl":"https://doi.org/10.1109/WCRE.2011.69","url":null,"abstract":"Source code contains textual, structural, and semantic information, which can all be leveraged for effective search. Some studies have proposed semantic code search where users can specify query topics in a natural language. Other studies can search through system dependence graphs. In this paper, we propose a semantic dependence search engine that integrates both kinds of techniques and can retrieve code snippets based on expressive user queries describing both topics and dependencies. Users can specify their search targets in a free form format describing desired topics (i.e., high-level semantic or functionality of the target code); a specialized graph query language allows users to describe low-level data and control dependencies in code and thus helps to refine the queries described in the free format. Our empirical evaluation on a number of software maintenance tasks shows that our search engine can efficiently locate desired code fragments accurately.","PeriodicalId":350863,"journal":{"name":"2011 18th Working Conference on Reverse Engineering","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121383450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I. Herraiz, Emad Shihab, Thanh H. D. Nguyen, A. Hassan
Software defects are generally used to indicate software quality. However, due to the nature of software, we are often only able to know about the defects found and reported, either following the testing process or after being deployed. In software research studies, it is assumed that a higher amount of defect reports represents a higher amount of defects in the software system. In this paper, we argue that widely deployed programs have more reported defects, regardless of their actual number of defects. To address this question, we perform a case study on the Debian GNU/Linux distribution, a well-known free / open source software collection. We compare the defects reported for all the software packages in Debian with their popularity. We find that the number of reported defects for a Debian package is limited by its popularity. This finding has implications on defect prediction studies, showing that they need to consider the impact of popularity on perceived quality, otherwise they might be risking bias.
{"title":"Impact of Installation Counts on Perceived Quality: A Case Study on Debian","authors":"I. Herraiz, Emad Shihab, Thanh H. D. Nguyen, A. Hassan","doi":"10.1109/WCRE.2011.34","DOIUrl":"https://doi.org/10.1109/WCRE.2011.34","url":null,"abstract":"Software defects are generally used to indicate software quality. However, due to the nature of software, we are often only able to know about the defects found and reported, either following the testing process or after being deployed. In software research studies, it is assumed that a higher amount of defect reports represents a higher amount of defects in the software system. In this paper, we argue that widely deployed programs have more reported defects, regardless of their actual number of defects. To address this question, we perform a case study on the Debian GNU/Linux distribution, a well-known free / open source software collection. We compare the defects reported for all the software packages in Debian with their popularity. We find that the number of reported defects for a Debian package is limited by its popularity. This finding has implications on defect prediction studies, showing that they need to consider the impact of popularity on perceived quality, otherwise they might be risking bias.","PeriodicalId":350863,"journal":{"name":"2011 18th Working Conference on Reverse Engineering","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132805998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}