The first decade of GUI ripping: Extensions, applications, and broader impacts
Pub Date: 2013-11-21 | DOI: 10.1109/WCRE.2013.6671275
A. Memon, Ishan Banerjee, Bao-Ngoc Nguyen, Bryan J. Robbins
This paper provides a retrospective examination of GUI Ripping - reverse engineering a workflow model of the graphical user interface of a software application - born a decade ago out of the recognition of a severe need to improve the then largely manual state of the practice of functional GUI testing. Over the last 10 years, GUI ripping has turned out to be an enabler for much research, both within our group at Maryland and in other groups. Researchers have found new and unique applications of GUI ripping, ranging from measuring human performance to re-engineering legacy user interfaces. GUI ripping has also enabled large-scale experimentation involving millions of test cases, thereby helping to understand the nature of GUI faults and the characteristics of the test cases that detect them. It has resulted in large multi-institutional, government-sponsored research projects on test automation and benchmarking. GUI ripping tools have been ported to many platforms, including Java AWT and Swing, iOS, Android, UNO, Microsoft Windows, and the web. In essence, the technology has transformed the way researchers and practitioners think about GUI testing: it is no longer considered a manual activity; rather, thanks largely to GUI Ripping, automation has become the primary focus of current GUI testing techniques.
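To make the core idea concrete, here is a minimal sketch of what a GUI ripper does: starting from a root window, it fires each actionable event, records the windows it reaches, and builds an event-flow model by breadth-first traversal. The `Window` model and `execute_event` callback are illustrative assumptions, not the authors' actual tool.

```python
# A minimal, hypothetical sketch of GUI ripping: traverse the GUI breadth-first,
# executing each widget's event and recording transitions into an event-flow graph.
from collections import deque

class Window:
    def __init__(self, title, widgets=None):
        self.title = title
        self.widgets = widgets or []   # actionable widgets (buttons, menu items, ...)

def rip(root, execute_event):
    """execute_event(window, widget) -> Window opened by the event, or None."""
    graph, seen, queue = {}, {root.title}, deque([root])
    while queue:
        window = queue.popleft()
        for widget in window.widgets:
            target = execute_event(window, widget)
            graph.setdefault(window.title, []).append(
                (widget, target.title if target else None))
            if target and target.title not in seen:
                seen.add(target.title)
                queue.append(target)
    return graph

# Toy usage: "File" opens a dialog; "Close" stays in the same window.
dialog = Window("Save Dialog", ["OK", "Cancel"])
main = Window("Main", ["File", "Close"])
events = {("Main", "File"): dialog}
print(rip(main, lambda w, wd: events.get((w.title, wd))))
```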
{"title":"The first decade of GUI ripping: Extensions, applications, and broader impacts","authors":"A. Memon, Ishan Banerjee, Bao-Ngoc Nguyen, Bryan J. Robbins","doi":"10.1109/WCRE.2013.6671275","DOIUrl":"https://doi.org/10.1109/WCRE.2013.6671275","url":null,"abstract":"This paper provides a retrospective examination of GUI Ripping - reverse engineering a workflow model of the graphical user interface of a software application - born a decade ago out of recognition of the severe need for improving the then largely manual state-of-the-practice of functional GUI testing. In these last 10 years, GUI ripping has turned out to be an enabler for much research, both within our group at Maryland and other groups. Researchers have found new and unique applications of GUI ripping, ranging from measuring human performance to re-engineering legacy user interfaces. GUI ripping has also enabled large-scale experimentation involving millions of test cases, thereby helping to understand the nature of GUI faults and characteristics of test cases to detect them. It has resulted in large multi-institutional Government-sponsored research projects on test automation and benchmarking. GUI ripping tools have been ported to many platforms, including Java AWT and Swing, iOS, Android, UNO, Microsoft Windows, and web. In essence, the technology has transformed the way researchers and practitioners think about the nature of GUI testing, no longer considered a manual activity; rather, thanks largely to GUI Ripping, automation has become the primary focus of current GUI testing techniques.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121501991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving SOA antipatterns detection in Service Based Systems by mining execution traces
Pub Date: 2013-10-01 | DOI: 10.1109/WCRE.2013.6671307
Mathieu Nayrolles, Naouel Moha, Petko Valtchev
Service Based Systems (SBSs), like other software systems, evolve due to changes in both user requirements and execution contexts. Continuous evolution can easily deteriorate the design and reduce the Quality of Service (QoS) of SBSs, and may result in poor design solutions, commonly known as SOA antipatterns. SOA antipatterns reduce the maintainability and reusability of SBSs. It is therefore important to first detect and then remove them. However, techniques for SOA antipattern detection are still in their infancy, and there are hardly any tools for their automatic detection. In this paper, we propose a new approach for SOA antipattern detection called SOMAD (Service Oriented Mining for Antipattern Detection), an evolution of the previously published SODA (Service Oriented Detection for Antipatterns) tool. SOMAD improves SOA antipattern detection by mining execution traces: it detects strong associations between sequences of service/method calls and further filters them using a suite of dedicated metrics. We first present the underlying association mining model and introduce the SBS-oriented rule metrics. We then describe a validating application of SOMAD to two independently developed SBSs. A comparison of our new tool with SODA reveals the superiority of the former: its precision is better by a margin ranging from 2.6% to 16.67%, recall remains optimal at 100%, and running time is significantly reduced (by a factor of 2.5 or more on the same test subjects).
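As a rough illustration of the trace-mining step, the sketch below mines pairwise association rules between service calls observed in execution traces and filters them by support and confidence thresholds. The plain support/confidence metrics and the thresholds are illustrative stand-ins for SOMAD's dedicated SBS-oriented rule metrics.

```python
# Illustrative association-rule mining over execution traces: keep rules
# "call a implies call b" that are frequent (support) and reliable (confidence).
from itertools import permutations
from collections import Counter

def mine_rules(traces, min_support=0.4, min_confidence=0.6):
    pair_counts, call_counts = Counter(), Counter()
    for trace in traces:
        calls = set(trace)                     # count each call once per trace
        call_counts.update(calls)
        pair_counts.update(permutations(calls, 2))
    n = len(traces)
    rules = []
    for (a, b), count in pair_counts.items():
        support, confidence = count / n, count / call_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, round(support, 2), round(confidence, 2)))
    return rules

traces = [["checkout", "billing", "shipping"],
          ["checkout", "billing"],
          ["search", "catalog"]]
print(mine_rules(traces))  # e.g. ('checkout', 'billing', 0.67, 1.0)
```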
{"title":"Improving SOA antipatterns detection in Service Based Systems by mining execution traces","authors":"Mathieu Nayrolles, Naouel Moha, Petko Valtchev","doi":"10.1109/WCRE.2013.6671307","DOIUrl":"https://doi.org/10.1109/WCRE.2013.6671307","url":null,"abstract":"Service Based Systems (SBSs), like other software systems, evolve due to changes in both user requirements and execution contexts. Continuous evolution could easily deteriorate the design and reduce the Quality of Service (QoS) of SBSs and may result in poor design solutions, commonly known as SOA antipatterns. SOA antipatterns lead to a reduced maintainability and reusability of SBSs. It is therefore important to first detect and then remove them. However, techniques for SOA antipattern detection are still in their infancy, and there are hardly any tools for their automatic detection. In this paper, we propose a new and innovative approach for SOA antipattern detection called SOMAD (Service Oriented Mining for Antipattern Detection) which is an evolution of the previously published SODA (Service Oriented Detection For Antpatterns) tool. SOMAD improves SOA antipattern detection by mining execution traces: It detects strong associations between sequences of service/method calls and further filters them using a suite of dedicated metrics. We first present the underlying association mining model and introduce the SBS-oriented rule metrics. We then describe a validating application of SOMAD to two independently developed SBSs. A comparison of our new tool with SODA reveals superiority of the former: Its precision is better by a margin ranging from 2.6% to 16.67% while the recall remains optimal at 100% and the speed is significantly reduces (2.5+ times on the same test subjects).","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127365195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
3rd workshop on Mining Unstructured Data
Pub Date: 2013-10-01 | DOI: 10.1109/WCRE.2013.6671333
Alberto Bacchelli, Nicolas Bettenburg, Latifa Guerrouj, S. Haiduc
Software development knowledge resides in the source code and in a number of other artifacts produced during the development process. To extract such knowledge, past software engineering research has focused extensively on mining the source code, i.e., the final product of the development effort. Currently, we witness an emerging trend in which researchers strive to exploit the information captured in artifacts such as emails and bug reports, free-form text requirements and specifications, comments, and identifiers. Because it is often expressed in natural language and lacks a well-defined structure, the information stored in these artifacts is referred to as unstructured data. Although research communities in Information Retrieval, Data Mining, and Natural Language Processing have devised techniques to deal with unstructured data, these techniques are usually limited in scope (e.g., designed for English-language text found in newspaper articles) and intended for use in specific scenarios, thus failing to achieve their full potential in a software development context. The workshop on Mining Unstructured Data (MUD) aims to provide a common venue for researchers and practitioners across the software engineering, information retrieval, and data mining research domains to share new approaches and emerging results in mining unstructured data.
{"title":"3rd workshop on Mining Unstructured Data","authors":"Alberto Bacchelli, Nicolas Bettenburg, Latifa Guerrouj, S. Haiduc","doi":"10.1109/WCRE.2013.6671333","DOIUrl":"https://doi.org/10.1109/WCRE.2013.6671333","url":null,"abstract":"Software development knowledge resides in the source code and in a number of other artefacts produced during the development process. To extract such a knowledge, past software engineering research has extensively focused on mining the source code, i.e., the final product of the development effort. Currently, we witness an emerging trend where researchers strive to exploit the information captured in artifacts such as emails and bug reports, free-form text requirements and specifications, comments and identifiers. Being often expressed in natural language, and not having a well-defined structure, the information stored in these artifacts is defined as unstructured data. Although research communities in Information Retrieval, Data Mining and Natural Language Processing have devised techniques to deal with unstructured data, these techniques are usually limited in scope (i.e., designed for English language text found in newspaper articles) and intended for use in specific scenarios, thus failing to achieve their full potential in a software development context. The workshop on Mining Unstructured Data (MUD) aims to provide a common venue for researchers and practitioners across software engineering, information retrieval and data mining research domains, to share new approaches and emerging results in mining unstructured data.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130044046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leveraging historical co-change information for requirements traceability
Pub Date: 2013-10-01 | DOI: 10.1109/WCRE.2013.6671311
Nasir Ali, Fehmi Jaafar, A. Hassan
Requirements traceability (RT) links requirements to the source code entities that implement them. Information Retrieval (IR) based RT link recovery approaches are often used to recover RT links automatically. However, such approaches exhibit low accuracy in terms of precision, recall, and ranking. This paper presents CoChaIR, an approach complementary to existing IR-based RT link recovery approaches. CoChaIR leverages historical co-change information of files to improve the accuracy of IR-based RT link recovery. We evaluated the effectiveness of CoChaIR on three datasets: iTrust, Pooka, and SIP Communicator. We compared CoChaIR with two different IR-based RT link recovery approaches, namely the vector space model and the Jensen-Shannon divergence model. Our study results show that CoChaIR significantly improves precision and recall, by up to 12.38% and 5.67% respectively, while decreasing the rank of true positive links by up to 48% and reducing false positive links by up to 44%.
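The intuition of combining IR scores with co-change history can be pictured as a re-ranking step: files that frequently co-changed with the top IR-ranked file get a boost. The linear weighting and the `alpha` parameter below are assumptions for illustration, not the paper's actual formulation.

```python
# Illustrative re-ranking: blend IR similarity with co-change evidence.
def rerank(ir_scores, cochange_counts, alpha=0.3):
    """ir_scores: {file: IR score}; cochange_counts: {(file_a, file_b): commits}."""
    top = max(ir_scores, key=ir_scores.get)            # best IR match
    total = sum(c for (a, _), c in cochange_counts.items() if a == top) or 1
    boosted = {}
    for f, score in ir_scores.items():
        boost = cochange_counts.get((top, f), 0) / total
        boosted[f] = (1 - alpha) * score + alpha * boost
    return sorted(boosted.items(), key=lambda kv: -kv[1])

ir = {"Login.java": 0.8, "Session.java": 0.5, "Util.java": 0.45}
cochanges = {("Login.java", "Session.java"): 9, ("Login.java", "Util.java"): 1}
print(rerank(ir, cochanges))  # Session.java overtakes Util.java via co-change
```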
{"title":"Leveraging historical co-change information for requirements traceability","authors":"Nasir Ali, Fehmi Jaafar, A. Hassan","doi":"10.1109/WCRE.2013.6671311","DOIUrl":"https://doi.org/10.1109/WCRE.2013.6671311","url":null,"abstract":"Requirements traceability (RT) links requirements to the corresponding source code entities, which implement them. Information Retrieval (IR) based RT links recovery approaches are often used to automatically recover RT links. However, such approaches exhibit low accuracy, in terms of precision, recall, and ranking. This paper presents an approach (CoChaIR), complementary to existing IR-based RT links recovery approaches. CoChaIR leverages historical co-change information of files to improve the accuracy of IR-based RT links recovery approaches. We evaluated the effectiveness of CoChaIR on three datasets, i.e., iTrust, Pooka, and SIP Communicator. We compared CoChaIR with two different IR-based RT links recovery approaches, i.e., vector space model and Jensen-Shannon divergence model. Our study results show that CoChaIR significantly improves precision and recall by up to 12.38% and 5.67% respectively; while decreasing the rank of true positive links by up to 48% and reducing false positive links by up to 44%.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132987585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An approach to clone detection in behavioural models
Pub Date: 2013-10-01 | DOI: 10.1109/WCRE.2013.6671325
Elizabeth P. Antony, Manar H. Alalfi, J. Cordy
In this paper we present an approach for identifying near-miss interaction clones in reverse-engineered UML behavioural models. Our goal is to identify patterns of interaction ("conversations") that can be used to characterize and abstract the run-time behaviour of web applications and other interactive systems. To leverage robust near-miss code clone technology, our approach is text-based, working at the level of XMI, the standard interchange serialization for UML. Behavioural model clone detection presents several challenges. First, it is not clear how to break a continuous stream of interaction between lifelines into meaningful conversational units. Second, unlike programming languages, the XMI text representation for UML is highly non-local, using attributes to reference information elsewhere in the model file. In this work we use a set of contextualizing source transformations on the XMI text representation to reveal the hidden hierarchical structure of the model and to granularize behavioural interactions into conversational units. We then adapt NiCad, a near-miss code clone detection tool, to identify conversational clones in reverse-engineered behavioural models.
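As a simplified illustration of granularizing an interaction into conversational units, the sketch below groups consecutive messages exchanged between the same pair of lifelines. The toy XMI fragment and the grouping rule are assumptions, far simpler than the paper's contextualizing source transformations.

```python
# Toy granularization: split a message stream into conversational units
# whenever the pair of communicating lifelines changes.
import xml.etree.ElementTree as ET

XMI = """<interaction>
  <message from="Browser" to="Server" name="GET /cart"/>
  <message from="Server" to="Browser" name="200 OK"/>
  <message from="Browser" to="DB" name="query"/>
</interaction>"""

def conversations(xmi_text):
    units, current, last_pair = [], [], None
    for msg in ET.fromstring(xmi_text).iter("message"):
        pair = frozenset((msg.get("from"), msg.get("to")))
        if pair != last_pair and current:   # lifeline pair changed: close the unit
            units.append(current)
            current = []
        current.append(msg.get("name"))
        last_pair = pair
    if current:
        units.append(current)
    return units

print(conversations(XMI))  # [['GET /cart', '200 OK'], ['query']]
```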
{"title":"An approach to clone detection in behavioural models","authors":"Elizabeth P. Antony, Manar H. Alalfi, J. Cordy","doi":"10.1109/WCRE.2013.6671325","DOIUrl":"https://doi.org/10.1109/WCRE.2013.6671325","url":null,"abstract":"In this paper we present an approach for identifying near-miss interaction clones in reverse-engineered UML behavioural models. Our goal is to identify patterns of interaction (“conversations”) that can be used to characterize and abstract the run-time behaviour of web applications and other interactive systems. In order to leverage robust near-miss code clone technology, our approach is text-based, working on the level of XMI, the standard interchange serialization for UML. Behavioural model clone detection presents several challenges - first, it is not clear how to break a continuous stream of interaction between lifelines into meaningful conversational units. Second, unlike programming languages, the XMI text representation for UML is highly non-local, using attributes to reference information in the model file remotely. In this work we use a set of contextualizing source transformations on the XMI text representation to reveal the hidden hierarchical structure of the model and granularize behavioural interactions into conversational units. Then we adapt NiCad, a near-miss code clone detection tool, to help us identify conversational clones in reverse-engineered behavioural models.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133266438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate developer recommendation for bug resolution
Pub Date: 2013-10-01 | DOI: 10.1109/WCRE.2013.6671282
Xin Xia, D. Lo, Xinyu Wang, Bo Zhou
Bug resolution refers to the activity that developers perform to diagnose, fix, test, and document bugs during software development and maintenance. It is a collaborative activity among developers who contribute their knowledge, ideas, and expertise to resolve bugs. Given a bug report, we would like to recommend the set of bug resolvers that could potentially contribute their knowledge to fix it. We refer to this problem as developer recommendation for bug resolution. In this paper, we propose a new and accurate method named DevRec for the developer recommendation problem. DevRec is a composite method which performs two kinds of analysis: bug report based analysis (BR-Based analysis) and developer based analysis (D-Based analysis). In the BR-Based analysis, we characterize a new bug report based on past bug reports that are similar to it; appropriate developers for the new bug report are found by investigating the developers of similar past bug reports. In the D-Based analysis, we compute the affinity of each developer to a bug report based on the characteristics of the bug reports that the developer has fixed before. This affinity is then used to find a set of developers that are "close" to a new bug report. We evaluate our solution on five large bug report datasets, GCC, OpenOffice, Mozilla, Netbeans, and Eclipse, containing a total of 107,875 bug reports. We show that DevRec achieves recall@5 and recall@10 scores of 0.4826-0.7989 and 0.6063-0.8924, respectively. We also compare DevRec with other state-of-the-art methods, such as Bugzie and DREX. The results show that DevRec on average improves the recall@5 and recall@10 scores of Bugzie by 57.55% and 39.39%, respectively. DevRec also outperforms DREX, improving the average recall@5 and recall@10 scores by 165.38% and 89.36%, respectively.
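To illustrate the BR-Based analysis, the toy sketch below scores developers by similarity-weighted votes from the developers of similar past reports. Jaccard overlap of title words is an illustrative stand-in for the paper's report characterization, and the sample data is invented.

```python
# Toy BR-Based recommendation: developers of similar past reports vote,
# weighted by how similar each past report is to the new one.
from collections import defaultdict

def jaccard(a, b):
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(new_report, history, k=2):
    scores = defaultdict(float)
    for old_report, developers in history:
        sim = jaccard(new_report, old_report)
        for dev in developers:
            scores[dev] += sim
    return sorted(scores, key=scores.get, reverse=True)[:k]

history = [("crash when saving large file", ["alice"]),
           ("crash on save dialog", ["alice", "bob"]),
           ("wrong font in editor", ["carol"])]
print(recommend("editor crash when saving", history))  # likely ['alice', 'bob']
```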
{"title":"Accurate developer recommendation for bug resolution","authors":"Xin Xia, D. Lo, Xinyu Wang, Bo Zhou","doi":"10.1109/WCRE.2013.6671282","DOIUrl":"https://doi.org/10.1109/WCRE.2013.6671282","url":null,"abstract":"Bug resolution refers to the activity that developers perform to diagnose, fix, test, and document bugs during software development and maintenance. It is a collaborative activity among developers who contribute their knowledge, ideas, and expertise to resolve bugs. Given a bug report, we would like to recommend the set of bug resolvers that could potentially contribute their knowledge to fix it. We refer to this problem as developer recommendation for bug resolution. In this paper, we propose a new and accurate method named DevRec for the developer recommendation problem. DevRec is a composite method which performs two kinds of analysis: bug reports based analysis (BR-Based analysis), and developer based analysis (D-Based analysis). In the BR-Based analysis, we characterize a new bug report based on past bug reports that are similar to it. Appropriate developers of the new bug report are found by investigating the developers of similar bug reports appearing in the past. In the D-Based analysis, we compute the affinity of each developer to a bug report based on the characteristics of bug reports that have been fixed by the developer before. This affinity is then used to find a set of developers that are “close” to a new bug report. We evaluate our solution on 5 large bug report datasets including GCC, OpenOffice, Mozilla, Netbeans, and Eclipse containing a total of 107,875 bug reports. We show that DevRec could achieve recall@5 and recall@10 scores of 0.4826-0.7989, and 0.6063-0.8924, respectively. We also compare DevRec with other state-of-art methods, such as Bugzie and DREX. The results show that DevRec on average improves recall@5 and recall@10 scores of Bugzie by 57.55% and 39.39% respectively. DevRec also outperforms DREX by improving the average recall@5 and recall@10 scores by 165.38% and 89.36%, respectively.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132519767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assessing the complexity of upgrading software modules
Pub Date: 2013-10-01 | DOI: 10.1109/WCRE.2013.6671319
B. Schoenmakers, N. V. D. Broek, I. Nagy, Bogdan Vasilescu, Alexander Serebrenik
Modern software development frequently involves developing multiple codelines simultaneously. Improvements to one codeline often need to be applied to other codelines as well, which is typically a time-consuming and error-prone process. To reduce this (manual) effort, changes are applied to the system's modules, and the affected modules are upgraded on the target system. This is a more coarse-grained approach than upgrading only the affected files. However, when a module is upgraded, one must make sure that all its dependencies are still satisfied. This paper proposes an approach to assess the ease of upgrading a software system. An algorithm was developed to compute the smallest set of upgrade dependencies, given the current version of a module and the version it has to be upgraded to. Furthermore, a visualization has been designed to explain why upgrading one module requires upgrading many additional modules. A case study was performed at ASML to study the ease of upgrading the TwinScan software. The analysis shows that removing elements from interfaces leads to many additional upgrade dependencies. Moreover, based on our analysis we have formulated a number of improvement suggestions, such as a clear separation between test code and production code, and the introduction of a structured process for symbol deprecation and removal.
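The smallest-upgrade-set computation can be pictured as a reachability closure over "forces an upgrade of" edges, as in the minimal sketch below. The data model is a hypothetical simplification, not ASML's module system or the paper's exact algorithm.

```python
# Minimal sketch: the set of modules that must be upgraded together is the
# closure of the target module under "upgrading X forces upgrading Y" edges.
def upgrade_set(target, forces):
    """forces: {module: [modules that must also be upgraded when it changes]}."""
    needed, stack = {target}, [target]
    while stack:
        for dep in forces.get(stack.pop(), []):
            if dep not in needed:
                needed.add(dep)
                stack.append(dep)
    return needed

# Upgrading "core" breaks interfaces used by "ui" and "io"; "ui" drags in "theme".
forces = {"core": ["ui", "io"], "ui": ["theme"]}
print(upgrade_set("core", forces))  # {'core', 'ui', 'io', 'theme'}
```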
{"title":"Assessing the complexity of upgrading software modules","authors":"B. Schoenmakers, N. V. D. Broek, I. Nagy, Bogdan Vasilescu, Alexander Serebrenik","doi":"10.1109/WCRE.2013.6671319","DOIUrl":"https://doi.org/10.1109/WCRE.2013.6671319","url":null,"abstract":"Modern software development frequently involves developing multiple codelines simultaneously. Improvements to one codeline should often be applied to other codelines as well, which is typically a time consuming and error-prone process. In order to reduce this (manual) effort, changes are applied to the system's modules and those affected modules are upgraded on the target system. This is a more coarse-grained approach than upgrading the affected files only. However, when a module is upgraded, one must make sure that all its dependencies are still satisfied. This paper proposes an approach to assess the ease of upgrading a software system. An algorithm was developed to compute the smallest set of upgrade dependencies, given the current version of a module and the version it has to be upgraded to. Furthermore, a visualization has been designed to explain why upgrading one module requires upgrading many additional modules. A case study has been performed at ASML to study the ease of upgrading the TwinScan software. The analysis shows that removing elements from interfaces leads to many additional upgrade dependencies. Moreover, based on our analysis we have formulated a number improvement suggestions such as a clear separation between the test code and the production code as well as an introduction of a structured process of symbols deprecation and removal.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130717782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Has this bug been reported?
K. Liu, Hee Beng Kuan Tan, Hongyu Zhang
Pub Date: 2012-11-11 | DOI: 10.1145/2393596.2393628
Bug reporting is essentially an uncoordinated process. The same bugs may be reported repeatedly because users or testers are unaware of previously reported bugs. As a result, extra time may be spent on bug triaging and fixing. To reduce this redundant effort, it is important to give bug reporters the ability to search for previously reported bugs. The search functions provided by existing bug tracking systems use relatively simple ranking functions, which often produce unsatisfactory results. In this paper, we adopt Ranking SVM, a Learning to Rank technique, to construct a ranking model for effective bug report search. We also propose to use knowledge from Wikipedia to discover the semantic relations among words and documents. Given a user query, the constructed ranking model can search for relevant bug reports in a bug tracking system. Unlike related work on duplicate bug report detection, our approach retrieves existing bug reports based on short user queries, before the complete bug report is submitted. We perform evaluations on more than 16,340 Eclipse and Mozilla bug reports. The evaluation results show that the proposed approach achieves better search results than the existing search functions provided by Bugzilla and Lucene. We believe our work can help users and testers locate potentially relevant bug reports more precisely.
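The pairwise idea underlying learning-to-rank methods such as Ranking SVM can be sketched as learning a weight vector so that, for a given query, relevant reports outscore irrelevant ones. In the sketch below a perceptron-style update stands in for the SVM optimizer, and the two features are invented for illustration.

```python
# Pairwise learning-to-rank sketch: learn w so that w.(relevant) > w.(irrelevant)
# for each training pair; a perceptron update replaces the SVM's optimization.
def train_pairwise(pairs, dims, epochs=20, lr=0.1):
    """pairs: list of (relevant_features, irrelevant_features) for one query."""
    w = [0.0] * dims
    for _ in range(epochs):
        for rel, irr in pairs:
            margin = sum(wi * (r - i) for wi, r, i in zip(w, rel, irr))
            if margin <= 0:  # wrong order: nudge w toward the feature difference
                w = [wi + lr * (r - i) for wi, r, i in zip(w, rel, irr)]
    return w

score = lambda w, x: sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical features: [title-word overlap, same component?]
pairs = [([0.9, 1.0], [0.2, 0.0]), ([0.7, 1.0], [0.4, 1.0])]
w = train_pairwise(pairs, dims=2)
print(score(w, [0.8, 1.0]) > score(w, [0.3, 0.0]))  # True: ranked as trained
```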
{"title":"Has this bug been reported?","authors":"K. Liu, Hee Beng Kuan Tan, Hongyu Zhang","doi":"10.1145/2393596.2393628","DOIUrl":"https://doi.org/10.1145/2393596.2393628","url":null,"abstract":"Bug reporting is essentially an uncoordinated process. The same bugs could be repeatedly reported because users or testers are unaware of previously reported bugs. As a result, extra time could be spent on bug triaging and fixing. In order to reduce redundant effort, it is important to provide bug reporters with the ability to search for previously reported bugs. The search functions provided by the existing bug tracking systems are using relatively simple ranking functions, which often produce unsatisfactory results. In this paper, we adopt Ranking SVM, a Learning to Rank technique to construct a ranking model for effective bug report search. We also propose to use the knowledge of Wikipedia to discover the semantic relations among words and documents. Given a user query, the constructed ranking model can search for relevant bug reports in a bug tracking system. Unlike related works on duplicate bug report detection, our approach retrieves existing bug reports based on short user queries, before the complete bug report is submitted. We perform evaluations on more than 16,340 Eclipse and Mozilla bug reports. The evaluation results show that the proposed approach can achieve better search results than the existing search functions provided by Bugzilla and Lucene. We believe our work can help users and testers locate potential relevant bug reports more precisely.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114833724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An incremental update framework for efficient retrieval from software libraries for bug localization
Pub Date: 2013-10-01 | DOI: 10.1109/WCRE.2013.6671281
Shivani Rao, Henry Medeiros, A. Kak
Information Retrieval (IR) based bug localization techniques use a bug report as a query against a software repository to retrieve relevant source files. These techniques index the source files in the repository and train a model, which is then queried for retrieval. Much current research focuses on improving the retrieval effectiveness of these methods; however, little consideration has been given to their efficiency for software repositories that are constantly evolving. As the repository evolves, index creation and model learning have to be repeated to ensure retrieval accuracy for each new bug. In doing so, the query latency may be unreasonably high, and re-computing the index and the model for files that did not change is computationally redundant. We propose an incremental update framework that continuously updates the index and the model using the changes made at each commit. We demonstrate that the same retrieval accuracy can be achieved in a fraction of the time needed by current approaches. Our results are based on two basic IR modeling techniques, the Vector Space Model (VSM) and the Smoothed Unigram Model (SUM). The dataset used in our validation experiments was created by tracking the commit history of the AspectJ and JodaTime software libraries over a span of 10 years.
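The essence of the incremental update can be sketched as an inverted index that re-indexes only the files touched by a commit rather than rebuilding everything. This plain Boolean index is an illustrative simplification of the paper's VSM and SUM models.

```python
# Incremental inverted index: on each commit, drop stale postings for the
# changed files and re-index only those files.
from collections import defaultdict

class IncrementalIndex:
    def __init__(self):
        self.postings = defaultdict(set)   # term -> {files containing it}
        self.terms_of = {}                 # file -> terms currently indexed

    def apply_commit(self, changed_files):
        """changed_files: {path: new_text, or None if the file was deleted}."""
        for path, text in changed_files.items():
            for term in self.terms_of.pop(path, set()):   # remove stale postings
                self.postings[term].discard(path)
            if text is not None:
                terms = set(text.lower().split())
                self.terms_of[path] = terms
                for term in terms:
                    self.postings[term].add(path)

    def query(self, text):
        return set.union(*(self.postings.get(t, set()) for t in text.lower().split()))

idx = IncrementalIndex()
idx.apply_commit({"Parser.java": "parse token stream", "Lexer.java": "token scanner"})
idx.apply_commit({"Lexer.java": "unicode scanner"})   # only Lexer.java re-indexed
print(idx.query("token"))  # {'Parser.java'} - stale Lexer.java posting removed
```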
{"title":"An incremental update framework for efficient retrieval from software libraries for bug localization","authors":"Shivani Rao, Henry Medeiros, A. Kak","doi":"10.1109/WCRE.2013.6671281","DOIUrl":"https://doi.org/10.1109/WCRE.2013.6671281","url":null,"abstract":"Information Retrieval (IR) based bug localization techniques use a bug reports to query a software repository to retrieve relevant source files. These techniques index the source files in the software repository and train a model which is then queried for retrieval purposes. Much of the current research is focused on improving the retrieval effectiveness of these methods. However, little consideration has been given to the efficiency of such approaches for software repositories that are constantly evolving. As the software repository evolves, the index creation and model learning have to be repeated to ensure accuracy of retrieval for each new bug. In doing so, the query latency may be unreasonably high, and also, re-computing the index and the model for files that did not change is computationally redundant. We propose an incremental update framework to continuously update the index and the model using the changes made at each commit. We demonstrate that the same retrieval accuracy can be achieved but with a fraction of the time needed by current approaches. Our results are based on two basic IR modeling techniques - Vector Space Model (VSM) and Smoothed Unigram Model (SUM). The dataset we used in our validation experiments was created by tracking commit history of AspectJ and JodaTime software libraries over a span of 10 years.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129995389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}