Mainframe Migration Based on Screen Scraping
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00077 | pp. 675-684
Sergio Flores-Ruiz, Ricardo Pérez-Castillo, Christoph Domann, Simona Puica
Companies possess a large array of legacy information systems that consume a great part of their IT budget in operations and maintenance. These systems are mission-critical, and they cannot simply be discarded since they retain business rules and provide information that is not available anywhere else. Unfortunately, decades-old legacy systems cannot easily withstand modification, and most of them are concentrated on mainframes. Although there are some white-box solutions for migrating mainframe systems, such solutions lack systematicity and do not provide mechanisms for verifying the preservation of business rules. Hence, this paper presents a black-box solution (ignoring the internal structure of COBOL programs) that uses a screen scraping technique to migrate mainframe systems toward JavaFX and relational databases. Together with this solution, the paper provides an automatic verification technique to check whether the recreated system reflects all the embedded business logic. The proposal has been designed and developed in the context of an industrial project, in which the solution has already migrated 43,000,000 mainframe screens from four systems. The main implication for researchers and practitioners is that screen scraping has proved feasible for migrating mainframe systems in large-scale projects within a manageable time-frame while preserving business rules.
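As a rough illustration of the black-box idea, the sketch below reads fields from a fixed-size terminal screen buffer purely by position, without any knowledge of the COBOL program behind it; the field names, coordinates, and screen layout are invented for the example and are not taken from the paper.

```python
# Minimal black-box screen scraping sketch: values are pulled from a 24x80
# terminal screen buffer by position only. Field names and coordinates are
# hypothetical; a real migration would take them from screen definitions.
SCREEN_WIDTH = 80
SCREEN_HEIGHT = 24

FIELD_MAP = [  # (field name, row, start column, length)
    ("customer_id",   2, 10, 8),
    ("customer_name", 3, 10, 30),
    ("balance",       5, 10, 12),
]

def scrape_screen(buffer: str) -> dict:
    """Extract named fields from a raw screen buffer (rows concatenated)."""
    rows = [buffer[i:i + SCREEN_WIDTH] for i in range(0, len(buffer), SCREEN_WIDTH)]
    return {name: rows[row][col:col + length].strip()
            for name, row, col, length in FIELD_MAP}

# Tiny demo with a mostly blank screen; a scraped record like this could then
# be stored in a relational table and rendered in a JavaFX form.
screen = [" " * SCREEN_WIDTH] * SCREEN_HEIGHT
screen[2] = (" " * 10 + "C0000042").ljust(SCREEN_WIDTH)
screen[3] = (" " * 10 + "ACME Corp.").ljust(SCREEN_WIDTH)
print(scrape_screen("".join(screen)))
```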
{"title":"Mainframe Migration Based on Screen Scraping","authors":"Sergio Flores-Ruiz, Ricardo Pérez-Castillo, Christoph Domann, Simona Puica","doi":"10.1109/ICSME.2018.00077","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00077","url":null,"abstract":"Companies possess a history and large array of legacy information systems that consume a great part of their IT budget in operations and maintenance. These systems are mission-critical, and they cannot be fully discarded since they retain business rules and provide information that is not available anywhere else. Unfortunately, decades-old legacy systems cannot easily withstand modification. Mainframes specifically conglomerate most of these legacy systems. Although there are some white-box solutions for migrating mainframe systems, such solutions lack systematicity and do not provide mechanisms for verifying business rules preservation. Hence, this paper presents a black-box solution (ignoring the internal structure of COBOL programs) which uses a screen scraping technique for migrating mainframe systems toward JavaFX and relational databases. Together with this solution, this paper provides an automatic verification technique to check if the recreated system reflects all the embedded business logic. This proposal has been designed and developed in the context of an industrial project, in which the solution has already migrated 43,000,000 mainframe screens from four systems. The main implication for researchers and practitioners is that screen scraping has proved to be feasible for migrating mainframe systems in large-scale projects within a manageable time-frame while preserving business.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"282 4 1","pages":"675-684"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86589147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Change-Aware Dynamic Program Analysis for JavaScript
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00023 | pp. 127-137
Dileep Ramachandrarao Krishna Murthy, Michael Pradel
Dynamic analysis is a powerful technique to detect correctness, performance, and security problems, in particular for programs written in dynamic languages, such as JavaScript. To catch mistakes as early as possible, developers should run such analyses regularly, e.g., by analyzing the execution of a regression test suite before each commit. Unfortunately, the high overhead of these analyses makes this approach prohibitively expensive, hindering developers from benefiting from the power of heavyweight dynamic analysis. This paper presents change-aware dynamic program analysis, an approach to make a common class of dynamic analyses change-aware. The key idea is to identify parts of the code affected by a change through a lightweight static change impact analysis, and to focus the dynamic analysis on these affected parts. We implement the idea based on the dynamic analysis framework Jalangi and evaluate it with 46 checkers from the DLint and JITProf tools. Our results show that change-aware dynamic analysis reduces the overall analysis time by 40%, on average, and by at least 80% for 31% of all commits.
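The following sketch conveys the core idea in a language-agnostic way (the paper's actual implementation instruments JavaScript via Jalangi): a static change impact analysis yields a set of affected functions, and a heavyweight runtime check fires only inside that set. The function names and the checker are invented.

```python
# Conceptual sketch of change-aware dynamic analysis: checks run only inside
# functions reported as affected by a lightweight static change impact
# analysis. The impacted set and the checker below are illustrative only.
IMPACTED_FUNCTIONS = {"parse_config", "apply_discount"}  # assumed analysis output

def change_aware(check):
    """Wrap a dynamic checker so it is skipped outside impacted functions."""
    def wrapper(func_name, *args):
        if func_name in IMPACTED_FUNCTIONS:
            return check(func_name, *args)
        return None  # skip the heavyweight check to save overhead
    return wrapper

@change_aware
def check_nan_argument(func_name, *args):
    # A DLint-style correctness rule: warn when NaN flows into a call.
    for a in args:
        if isinstance(a, float) and a != a:  # NaN is the only value != itself
            print(f"warning: NaN passed to {func_name}")

# Instrumented call sites would report the enclosing function's name:
check_nan_argument("apply_discount", float("nan"))   # impacted, prints a warning
check_nan_argument("render_footer", float("nan"))    # outside the impacted set, skipped
```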
{"title":"Change-Aware Dynamic Program Analysis for JavaScript","authors":"Dileep Ramachandrarao Krishna Murthy, Michael Pradel","doi":"10.1109/ICSME.2018.00023","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00023","url":null,"abstract":"Dynamic analysis is a powerful technique to detect correctness, performance, and security problems, in particular for programs written in dynamic languages, such as JavaScript. To catch mistakes as early as possible, developers should run such analyses regularly, e.g., by analyzing the execution of a regression test suite before each commit. Unfortunately, the high overhead of these analyses make this approach prohibitively expensive, hindering developers from benefiting from the power of heavyweight dynamic analysis. This paper presents change-aware dynamic program analysis, an approach to make a common class of dynamic analyses change-aware. The key idea is to identify parts of the code affected by a change through a lightweight static change impact analysis, and to focus the dynamic analysis on these affected parts. We implement the idea based on the dynamic analysis framework Jalangi and evaluate it with 46 checkers from the DLint and JITProf tools. Our results show that change-aware dynamic analysis reduces the overall analysis time by 40%, on average, and by at least 80% for 31% of all commits.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"239 1","pages":"127-137"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80433576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparing Multiple MATLAB/Simulink Models Using Static Connectivity Matrix Analysis
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00026 | pp. 160-171
Alexander Schlie, Sandro Schulze, Ina Schaefer
Model-based languages such as MATLAB/Simulink are crucial for the development of embedded software systems. To adapt to changing requirements, engineers commonly copy and modify existing systems to create new variants. Commonly referred to as clone-and-own, this reuse strategy is easy to apply and beneficial in the short term, but it entails severe maintenance and consistency issues in the long term, leading to a huge amount of redundant and similar assets. Moreover, a later transition towards structured reuse, such as with software product lines, inevitably requires the comparison of all existing variants prior to the actual migration. However, current work mostly revolves around the comparison of only two systems, and although approaches that can cope with more have been proposed, they are not applicable to embedded software systems such as MATLAB/Simulink models. In this paper, we bridge this gap and propose Static Connectivity Matrix Analysis (SCMA), a novel comparison procedure that allows for the evaluation of multiple MATLAB/Simulink model variants at once. In particular, we transform models into a matrix form that is used to compare all models and to identify all similar structures between them, even when model parts have been completely relocated during clone-and-own. We allow engineers to tailor results and to focus on any arbitrary variant subset, enabling individual reasoning prior to migration. We provide a feasibility study from the automotive domain, showing our matrix representation to be suitable and our technique to be fast and precise.
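To make the matrix idea concrete, here is a minimal sketch (not the paper's SCMA procedure) that builds a connectivity matrix from block connections and reports the connections two variants share; the block names and wiring are invented.

```python
# Toy connectivity-matrix comparison of two model variants. Blocks and
# connections are invented; SCMA itself handles subsystems, relocation, and
# result tailoring far beyond this sketch.
def connectivity_matrix(blocks, connections):
    """Build a |blocks| x |blocks| 0/1 matrix of directed signal connections."""
    index = {b: i for i, b in enumerate(blocks)}
    matrix = [[0] * len(blocks) for _ in blocks]
    for src, dst in connections:
        matrix[index[src]][index[dst]] = 1
    return matrix

def shared_connections(blocks_a, conns_a, blocks_b, conns_b):
    """Connections present in both variants between blocks common to both."""
    common = set(blocks_a) & set(blocks_b)
    return {(s, d) for (s, d) in set(conns_a) & set(conns_b)
            if s in common and d in common}

# Variant B adds a Scope block and reorders blocks, yet the shared structure
# In -> Filter -> Gain -> Out is still found.
variant_a = (["In", "Filter", "Gain", "Out"],
             [("In", "Filter"), ("Filter", "Gain"), ("Gain", "Out")])
variant_b = (["In", "Gain", "Filter", "Out", "Scope"],
             [("In", "Filter"), ("Filter", "Gain"), ("Gain", "Out"), ("Gain", "Scope")])
print(connectivity_matrix(*variant_a))
print(sorted(shared_connections(*variant_a, *variant_b)))
```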
{"title":"Comparing Multiple MATLAB/Simulink Models Using Static Connectivity Matrix Analysis","authors":"Alexander Schlie, Sandro Schulze, Ina Schaefer","doi":"10.1109/ICSME.2018.00026","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00026","url":null,"abstract":"Model-based languages such as MATLAB/Simulink are crucial for the development of embedded software systems. To adapt to changing requirements, engineers commonly copy and modify existing systems to create new variants. Commonly referred to as clone-and-own, this reuse strategy is easy to apply and beneficial in the short term, but it entails severe maintenance and consistency issues in the long term, leading to a huge amount of redundant and similar assets. Moreover, a later transition towards structured reuse such as with software product lines inevitably requires the comparison of all existing variants prior to the actual migration. However, current work mostly revolves around the comparison of only two systems and despite approaches proposed that can cope with more, such are not applicable to embedded software systems such as MATLAB/Simulink. In this paper, we bridge this gap and propose Static Connectivity Matrix Analysis (SCMA), a novel comparison procedure that allows for the evaluation of multiple MATLAB/Simulink model variants at once. In particular, we transform models into a matrix form which is used to compare all models and to identify all similar structures between them, even with model parts being completely relocated during clone-and-own. We allow engineers to tailor results and to focus on any arbitrary variant subset, enabling individual reasoning prior to migration. We provide a feasibility study from the automotive domain, showing our matrix representation to be suitable and our technique to be fast and precise.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"57 1","pages":"160-171"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80436343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generating Accurate and Compact Edit Scripts Using Tree Differencing
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00036 | pp. 264-274
Veit Frick, Thomas Grassauer, Fabian Beck, M. Pinzger
For analyzing changes in source code, edit scripts are used to describe the differences between two versions of a file. These scripts consist of a list of actions that, applied to the source file, result in the new version of the file. In contrast to line-based source code differencing, tree-based approaches such as GumTree, MTDIFF, or ChangeDistiller extract changes by comparing the abstract syntax trees (AST) of two versions of a source file. One benefit of tree-based approaches is their ability to capture moved (sub)trees in the AST. Our approach, the Iterative Java Matcher (IJM), builds upon GumTree and aims at generating more accurate and compact edit scripts that capture the developer's intent. This is achieved by improving the quality of the generated move and update actions, which are the main source of inaccurate actions generated by previous approaches. To evaluate our approach, we conducted a study with 11 external experts and manually analyzed the accuracy of 2,400 randomly selected edit actions. Comparing IJM to GumTree and MTDIFF, the results show that IJM provides better accuracy for move and update actions and is more beneficial to understanding the changes.
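The sketch below shows what an edit script is in the simplest possible form: a naive positional diff of two (label, children) trees that emits update, insert, and delete actions. Unlike GumTree, MTDIFF, or IJM, it does no real node matching and cannot detect moves; the example trees are invented.

```python
# Naive tree differencing: emit update/insert/delete actions by comparing two
# (label, [children]) trees positionally. Real tools match nodes across the
# trees and additionally produce move actions.
def edit_script(old, new, path="root"):
    if old is None:
        return [("insert", path, new[0])]
    if new is None:
        return [("delete", path, old[0])]
    actions = [("update", path, old[0], new[0])] if old[0] != new[0] else []
    for i in range(max(len(old[1]), len(new[1]))):
        o = old[1][i] if i < len(old[1]) else None
        n = new[1][i] if i < len(new[1]) else None
        actions += edit_script(o, n, f"{path}/{i}")
    return actions

before = ("MethodDecl", [("Param", []), ("Return", [("Literal:0", [])])])
after  = ("MethodDecl", [("Param", []), ("Return", [("Literal:1", [])]), ("If", [])])
print(edit_script(before, after))
# [('update', 'root/1/0', 'Literal:0', 'Literal:1'), ('insert', 'root/2', 'If')]
```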
{"title":"Generating Accurate and Compact Edit Scripts Using Tree Differencing","authors":"Veit Frick, Thomas Grassauer, Fabian Beck, M. Pinzger","doi":"10.1109/ICSME.2018.00036","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00036","url":null,"abstract":"For analyzing changes in source code, edit scriptsare used to describe the differences between two versions of afile. These scripts consist of a list of actions that, applied to thesource file, result in the new version of the file. In contrast toline-based source code differencing, tree-based approaches suchas GumTree, MTDIFF, or ChangeDistiller extract changes bycomparing the abstract syntax trees (AST) of two versions of asource file. One benefit of tree-based approaches is their abilityto capture moved (sub) trees in the AST. Our approach, theIterative Java Matcher (IJM), builds upon GumTree and aims atgenerating more accurate and compact edit scripts that capturethe developer's intent. This is achieved by improving the qualityof the generated move and update actions, which are the mainsource of inaccurate actions generated by previous approaches. To evaluate our approach, we conducted a study with 11 external experts and manually analyzed the accuracy of 2400 randomly selected editactions. Comparing IJM to GumTree and MTDIFF, the resultsshow that IJM provides better accuracy for move and updateactions and is more beneficial to understanding the changes.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"40 1","pages":"264-274"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80358409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Extraction of Augmented Models for Android Apps
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00065 | pp. 549-553
Santiago Linan, Laura Bello-Jiménez, Maria Arevalo, M. Linares-Vásquez
Mobile software development involves significant challenges for developers, such as device fragmentation (i.e., enormous hardware and software diversity), event-driven programming (i.e., programming based on user interactions, sensor readings, and other events to which the program must react), and continuously evolving platforms (i.e., fast-changing mobile frameworks and technologies). This can lead to error-prone code because of the many combinations of external variables that must be taken into account during app development. Thus, testing is essential for delivering high-quality mobile apps. However, defining test suites is a difficult and effort-intensive task, because it must consider all possible states of an app, its context (e.g., the device on which it is running, sensors, touch gestures, screen proportions, connectivity), and a large combination of mobile devices and operating systems. Previous efforts have extracted models that support automated testing. However, as of today there is no single model that synthesizes different aspects of mobile apps such as domain, usage, context, and GUI-related information. These aspects represent complementary information that can be combined into a single, enriched model. In this paper, we propose a multi-model representation that combines information extracted statically and dynamically from Android apps. Our approach allows practitioners to automatically extract augmented models that combine different types of information, and it could help them during comprehension and testing tasks.
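As a hedged sketch of what combining statically and dynamically extracted information might look like, the snippet below merges a statically derived screen-transition graph with transitions observed during a dynamic exploration run, annotating each edge with its provenance; the activity names and edges are invented, and the paper's actual model is considerably richer.

```python
# Merge a statically extracted GUI transition graph with dynamically observed
# transitions into one augmented model, tagging each edge with its source(s).
# Activity names and transitions are purely illustrative.
static_edges  = {("LoginActivity", "HomeActivity"),
                 ("HomeActivity", "SettingsActivity")}
dynamic_edges = {("LoginActivity", "HomeActivity"),
                 ("HomeActivity", "CartActivity")}

def merge_models(static, dynamic):
    model = {}
    for edge in static | dynamic:
        provenance = []
        if edge in static:
            provenance.append("static")
        if edge in dynamic:
            provenance.append("dynamic")
        model[edge] = provenance
    return model

for (src, dst), provenance in sorted(merge_models(static_edges, dynamic_edges).items()):
    print(f"{src} -> {dst}  [{', '.join(provenance)}]")
```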
{"title":"Automated Extraction of Augmented Models for Android Apps","authors":"Santiago Linan, Laura Bello-Jiménez, Maria Arevalo, M. Linares-Vásquez","doi":"10.1109/ICSME.2018.00065","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00065","url":null,"abstract":"Mobile software development involves significant challenges to developers such as device fragmentation (i.e., enormous hardware and software diversity), event-driven programming (i.e., programming based on user interactions, sensor readings and other events where the program must react) and continuous evolving platforms (i.e., fast changing mobile frameworks and technologies). This can lead programmers to error-prone code, because of the multiple combinations of external variables that must be taken into account in an app development process. Thus, testing is an underlying necessity in mobile applications to deliver high quality apps. However, defining tests suites for app development is a difficult task that requires a lot of effort, because it must consider all the possible states of an app, its context (e.g., device in which is running, sensors, touch gestures, screen proportions, connectivity), and a large combination of mobile devices and operating systems. Previous efforts have been done to extract models that support automated testing. However, as of today there is not a single model that synthesizes different aspects in mobile apps such as domain, usage, context and GUI-related information. These aspects represent complementary information that can be mixed into a single and enriched model. In this paper, we propose a multi-model representation that combines information extracted statically and dynamically from Android apps. Our approach allows practitioners to automatically extract augmented models that combine different types of information, and could help them during comprehension and testing tasks.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"17 1","pages":"549-553"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78385529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Value of Bug Reports for Retrieval-Based Bug Localization
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00048 | pp. 524-528
Dawn J Lawrie, D. Binkley
Software engineering researchers have been applying tools and techniques from information retrieval (IR) to problems such as bug localization to lower the manual effort required to perform maintenance tasks. The central challenge when using an IR-based tool is the formation of a high-quality query. When performing bug localization, one easily accessible source of query words is the bug report. A recent paper investigated the sufficiency of this source by using a genetic algorithm (GA) to build high-quality queries. Unfortunately, the GA in essence "cheats" as it makes use of query performance when evolving a good query. This raises the question: is it feasible to attain similar results without "cheating"? One approach to providing cheat-free queries is to employ automatic summarization. The performance of the resulting summaries calls into question the sufficiency of bug reports as a source of query words. To better understand the situation, Information Need Analysis (INA) is applied to quantify both how well the GA is performing and, perhaps more importantly, how well a bug report captures the vocabulary needed to perform IR-based bug localization. The results find that summarization shows potential to produce high-quality queries, but it requires more training data. Furthermore, while bug reports provide a useful source of query words, they are rather limited, and thus query expansion techniques, perhaps in combination with summarization, will likely produce higher-quality queries.
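For readers unfamiliar with IR-based bug localization, the sketch below shows the basic pipeline the discussion presupposes: the bug report text is used as a query and candidate source files are ranked by TF-IDF cosine similarity. The file names, texts, and tokenizer are placeholders; real pipelines also split identifiers, stem, and index much larger corpora.

```python
# Minimal TF-IDF ranking of source files against a bug-report query. All
# names and file contents are invented; this is the retrieval step that the
# query-quality discussion above is about.
import math
from collections import Counter

def tokenize(text):
    return [t for t in text.lower().replace(".", " ").split() if t.isalnum()]

def tfidf_rank(query, documents):
    docs = {name: Counter(tokenize(text)) for name, text in documents.items()}
    n = len(docs)
    df = Counter(term for doc in docs.values() for term in doc)  # document frequency
    def vectorize(counts):
        return {t: c * math.log(n / df[t]) for t, c in counts.items() if t in df}
    def cosine(a, b):
        dot = sum(a[t] * b.get(t, 0.0) for t in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0
    q = vectorize(Counter(tokenize(query)))
    return sorted(((cosine(q, vectorize(d)), name) for name, d in docs.items()), reverse=True)

bug_report = "NullPointerException when saving user profile settings"
files = {
    "ProfileSettingsController.java": "save user profile settings validate null pointer",
    "ReportRenderer.java": "render monthly report chart export legend",
}
for score, name in tfidf_rank(bug_report, files):
    print(f"{score:.3f}  {name}")
```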
{"title":"On the Value of Bug Reports for Retrieval-Based Bug Localization","authors":"Dawn J Lawrie, D. Binkley","doi":"10.1109/ICSME.2018.00048","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00048","url":null,"abstract":"Software engineering researchers have been applying tools and techniques from information retrieval (IR) to problems such as bug localization to lower the manual effort required to perform maintenance tasks. The central challenge when using an IR-based tool is the formation of a high-quality query. When performing bug localization, one easily accessible source of query words is the bug report. A recent paper investigated the sufficiency of this source by using a genetic algorithm (GA) to build high quality queries. Unfortunately, the GA in essence \"cheats\" as it makes use of query performance when evolving a good query. This raises the question, is it feasible to attain similar results without \"cheating?\" One approach to providing cheat-free queries is to employ automatic summarization. The performance of the resulting summaries calls into question the sufficiency of the bug reports as a source of query words. To better understand the situation, Information Need Analysis (INA) is applied to quantify both how well the GA is performing and, perhaps more importantly, how well a bug report captures the vocabulary needed to perform IR-based bug localization. The results find that summarization shows potential to produce high-quality queries, but it requires more training data. Furthermore, while bug reports provide a useful source of query words, they are rather limited and thus query expansion techniques, perhaps in combination with summarization, will likely produce higher-quality queries.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"9 1","pages":"524-528"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78796752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Embracing Technical Debt, from a Startup Company Perspective
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00051 | pp. 415-425
Terese Besker, A. Martini, Rumesh Edirisooriya Lokuge, Kelly Blincoe, J. Bosch
Software startups are typically under extreme pressure to get to market quickly with limited resources and high uncertainty. This pressure and uncertainty are likely to cause startups to accumulate technical debt as they make decisions that are focused more on the short-term than the long-term health of the codebase. However, most research on technical debt has focused on more mature software teams, who may have less pressure and, therefore, reason about technical debt very differently than software startups do. In this study, we seek to understand the organizational factors that lead to, and the benefits and challenges associated with, the intentional accumulation of technical debt in software startups. We interviewed 16 professionals involved in seven different software startups. We find that the startup phase, the experience of the developers, the software knowledge of the founders, and the level of employee growth are some of the organizational factors that influence the intentional accumulation of technical debt. In addition, we find that software startups are typically driven to achieve a "good enough level," and this guides the amount of technical debt that they intentionally accumulate to balance the benefits of speed to market and reduced resource use against the challenges of later addressing technical debt.
{"title":"Embracing Technical Debt, from a Startup Company Perspective","authors":"Terese Besker, A. Martini, Rumesh Edirisooriya Lokuge, Kelly Blincoe, J. Bosch","doi":"10.1109/ICSME.2018.00051","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00051","url":null,"abstract":"Software startups are typically under extreme pressure to get to market quickly with limited resources and high uncertainty. This pressure and uncertainty is likely to cause startups to accumulate technical debt as they make decisions that are more focused on the short-term than the long-term health of the codebase. However, most research on technical debt has been focused on more mature software teams, who may have less pressure and, therefore, reason about technical debt very differently than software startups. In this study, we seek to understand the organizational factors that lead to and the benefits and challenges associated with the intentional accumulation of technical debt in software startups. We interviewed 16 professionals involved in seven different software startups. We find that the startup phase, the experience of the developers, software knowledge of the founders, and level of employee growth are some of the organizational factors that influence the intentional accumulation of technical debt. In addition, we find the software startups are typically driven to achieve a \"good enough level,\" and this guides the amount of technical debt that they intentionally accumulate to balance the benefits of speed to market and reduced resources with the challenges of later addressing technical debt.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"215 1","pages":"415-425"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76669390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toward Automatic Summarization of Arbitrary Java Statements for Novice Programmers
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00063 | pp. 539-543
Mohammed Hassan, Emily Hill
Novice programmers sometimes need to understand code written by others. Unfortunately, most software projects lack comments suitable for novices. The lack of comments has been addressed through automated techniques that generate comments from program statements. However, because these techniques were aimed at experienced programmers, they lack the context of how the statements function. In this paper, we present a novel technique for automatically generating comments for Java statements that are suitable for novice programmers. Our technique not only goes beyond existing approaches to method summarization to meet the needs of novices, it also leverages API documentation when available. In an experimental study with 30 computer science undergraduate students, we observed that explanations based on our technique were preferred over those of an existing approach.
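As a loose illustration of how a statement-level comment generator could work, this sketch matches a single Java assignment-with-call pattern and fills a natural-language template; the regex, wording, and example statements are invented, and unlike the paper's technique it performs no API documentation lookup.

```python
# Toy template-based explanation of a Java statement for a novice reader.
# Only one statement shape (assignment from a method call) is handled.
import re

ASSIGN_CALL = re.compile(
    r"(?:[\w<>\[\]]+\s+)?(?P<target>\w+)\s*=\s*"
    r"(?P<receiver>\w+)\.(?P<method>\w+)\((?P<args>[^)]*)\)\s*;")

def explain(statement: str) -> str:
    m = ASSIGN_CALL.match(statement.strip())
    if not m:
        return "No template matches this statement."
    args = m.group("args").strip()
    with_args = f" with {args}" if args else ""
    return (f"Calls {m.group('method')} on {m.group('receiver')}{with_args} "
            f"and stores the result in {m.group('target')}.")

print(explain("int count = scanner.nextInt();"))
# Calls nextInt on scanner and stores the result in count.
print(explain("String name = user.getName();"))
# Calls getName on user and stores the result in name.
```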
{"title":"Toward Automatic Summarization of Arbitrary Java Statements for Novice Programmers","authors":"Mohammed Hassan, Emily Hill","doi":"10.1109/ICSME.2018.00063","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00063","url":null,"abstract":"Novice programmers sometimes need to understand code written by others. Unfortunately, most software projects lack comments suitable for novices. The lack of comments have been addressed through automated techniques of generating comments based on program statements. However, these techniques lacked the context of how these statements function since they were aimed toward experienced programmers. In this paper, we present a novel technique towards automatically generating comments for Java statements suitable for novice programmers. Our technique not only goes beyond existing approaches to method summarization to meet the needs of novices, it also leverages API documentation when available. In an experimental study of 30 computer science undergraduate students, we observed explanations based on our technique to be preferred over an existing approach.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"55 1","pages":"539-543"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90971590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Closer Look at Real-World Patches
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00037 | pp. 275-286
Kui Liu, Dongsun Kim, Anil Koyuncu, Li Li, Tegawendé F. Bissyandé, Yves Le Traon
Bug fixing is a time-consuming and tedious task. To reduce the manual effort in bug fixing, researchers have presented automated approaches to software repair. Unfortunately, recent studies have shown that state-of-the-art techniques in automated repair tend to generate patches for only a small number of bugs, and even those patches often have quality issues (e.g., incorrect behavior and nonsensical changes). To improve automated program repair (APR) techniques, the community should deepen its knowledge of repair actions in real-world patches, since most of the techniques rely on patches written by human developers. Previous investigations of real-world patches are limited to the statement level, which is not sufficiently fine-grained to build this knowledge. In this work, we contribute to building this knowledge via a systematic and fine-grained study of 16,450 bug fix commits from seven Java open-source projects. We find that there are opportunities for APR techniques to improve their effectiveness by looking at code elements that have not yet been investigated. We also discuss nine insights into tuning automated repair tools. For example, a small number of statement and expression types are recurrently impacted by real-world patches, and expression-level granularity could reduce the search space for finding fix ingredients, which previous studies never explored.
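To give a feel for what a fine-grained, AST-level view of a patch provides, the sketch below counts which node types change between two versions of a snippet. It uses Python's ast module purely as an analogue (the paper studies Java patches), and the before/after snippet is invented.

```python
# Count AST node types whose occurrence count changes between two versions of
# a snippet: a fine-grained, statement/expression-level view of a patch.
# Python's ast module stands in for a Java AST here; the snippet is made up.
import ast
from collections import Counter

def node_type_counts(source: str) -> Counter:
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))

def impacted_node_types(old_src: str, new_src: str) -> dict:
    old, new = node_type_counts(old_src), node_type_counts(new_src)
    return {t: new[t] - old[t]
            for t in set(old) | set(new) if new[t] != old[t]}

before = "def total(xs):\n    return sum(xs)\n"
after  = "def total(xs):\n    if not xs:\n        return 0\n    return sum(xs)\n"
print(impacted_node_types(before, after))
# The fix adds an If statement and the expressions it contains (Not, Constant, ...).
```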
{"title":"A Closer Look at Real-World Patches","authors":"Kui Liu, Dongsun Kim, Anil Koyuncu, Li Li, Tegawendé F. Bissyandé, Yves Le Traon","doi":"10.1109/ICSME.2018.00037","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00037","url":null,"abstract":"Bug fixing is a time-consuming and tedious task. To reduce the manual efforts in bug fixing, researchers have presented automated approaches to software repair. Unfortunately, recent studies have shown that the state-of-the-art techniques in automated repair tend to generate patches only for a small number of bugs even with quality issues (e.g., incorrect behavior and nonsensical changes). To improve automated program repair (APR) techniques, the community should deepen its knowledge on repair actions from real-world patches since most of the techniques rely on patches written by human developers. Previous investigations on real-world patches are limited to statement level that is not sufficiently fine-grained to build this knowledge. In this work, we contribute to building this knowledge via a systematic and fine-grained study of 16,450 bug fix commits from seven Java open-source projects. We find that there are opportunities for APR techniques to improve their effectiveness by looking at code elements that have not yet been investigated. We also discuss nine insights into tuning automated repair tools. For example, a small number of statement and expression types are recurrently impacted by real-world patches, and expression-level granularity could reduce search space of finding fix ingredients, where previous studies never explored.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"56 1","pages":"275-286"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90991495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Conceptual Replication Study on Bugs that Get Fixed in Open Source Software
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00039 | pp. 299-310
Haoren Wang, Huzefa H. Kagdi
Bugs dominate the corrective maintenance and evolutionary changes in large-scale software systems. The topic of bugs has been extensively investigated and reported in the literature. Unfortunately, the fundamental question of whether a reported bug will be fixed at all has not received much attention. This paper presents an empirical study on four open source projects to examine the factors that influence the likelihood of a bug getting fixed. Overall, our study can be contextualized as a conceptual replication of a previous study on Microsoft systems from a commercial domain. The similarities and differences in terms of the design, execution, and results between the two studies are discussed. It was observed from these systems that the reputations of the reporter and of the developer assigned to fix the bug, along with the number of comments on it, have the most substantial impact on its probability of getting fixed. Moreover, we formulated a predictive model from features available as soon as a bug is reported to estimate whether it will be fixed or not. Intra-project and inter-project (cross-project) validations were performed. Precision and Recall metrics were used to assess the predictive model. Their values were in the 60% to 70% range.
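A schematic version of such a predictive model is sketched below: report-time features feed a logistic regression classifier whose predictions are scored with precision and recall. The features mirror those named in the abstract, but the training data is randomly generated solely to make the sketch runnable; it does not reproduce the paper's results.

```python
# Schematic fix-likelihood model: report-time features -> classifier ->
# precision/recall. Data is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
reporter_reputation = rng.uniform(0, 1, n)
assignee_reputation = rng.uniform(0, 1, n)
comment_count = rng.poisson(3, n)
X = np.column_stack([reporter_reputation, assignee_reputation, comment_count])
# Synthetic ground truth: higher reputations and more discussion -> more likely fixed.
y = (0.4 * reporter_reputation + 0.4 * assignee_reputation
     + 0.05 * comment_count + rng.normal(0, 0.2, n)) > 0.5

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
predicted = model.predict(X_test)
print("precision:", round(precision_score(y_test, predicted), 2),
      "recall:", round(recall_score(y_test, predicted), 2))
```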
{"title":"A Conceptual Replication Study on Bugs that Get Fixed in Open Source Software","authors":"Haoren Wang, Huzefa H. Kagdi","doi":"10.1109/ICSME.2018.00039","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00039","url":null,"abstract":"Bugs dominate the corrective maintenance and evolutionary changes in large-scale software systems. The topic of bugs has been extensively investigated and reported in the literature. Unfortunately, the existential question of all \"whether a reported bug will be fixed or not\" has not received much attention. The paper presents an empirical study on four open source projects to examine the factors that influence the likelihood of a bug getting fixed or not. Overall, our study can be contextualized as a conceptual replication of a previous study on Microsoft systems from a commercial domain. The similarities and differences in terms of the design, execution, and results between the two studies are discussed. It was observed from these systems that the reputations of the reporter and assigned developer to fix it, and the number of comments on a bug have the most substantial impact on its probability to get fixed. Moreover, we formulated a predictive model from features available as soon as a bug is reported to estimate whether it will be fixed or not. Intra and inter (cross) project validations were performed. Precision and Recall metrics were used to assess the predictive model. Their values were recorded in the 60% to 70% range.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"127 1","pages":"299-310"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85713468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}