André C. Hora, M. T. Valente, R. Robbes, N. Anquetil
Commonly, software systems have public (and stable) interfaces, and internal (and possibly unstable) interfaces. Despite being discouraged, client developers often use internal interfaces, which may cause their systems to fail when they evolve. To overcome this problem, API producers may promote internal interfaces to public. In practice, however, API producers have no assistance to identify public interface candidates. In this paper, we study the transition from internal to public interfaces. We aim to help API producers to deliver a better product and API clients to benefit sooner from public interfaces. Our empirical investigation on five widely adopted Java systems present the following observations. First, we identified 195 promotions from 2,722 internal interfaces. Second, we found that promoted internal interfaces have more clients. Third, we predicted internal interface promotion with precision between 50%-80%, recall 26%-82%, and AUC 74%-85%. Finally, by applying our predictor on the last version of the analyzed systems, we automatically detected 382 public interface candidates.
{"title":"When should internal interfaces be promoted to public?","authors":"André C. Hora, M. T. Valente, R. Robbes, N. Anquetil","doi":"10.1145/2950290.2950306","DOIUrl":"https://doi.org/10.1145/2950290.2950306","url":null,"abstract":"Commonly, software systems have public (and stable) interfaces, and internal (and possibly unstable) interfaces. Despite being discouraged, client developers often use internal interfaces, which may cause their systems to fail when they evolve. To overcome this problem, API producers may promote internal interfaces to public. In practice, however, API producers have no assistance to identify public interface candidates. In this paper, we study the transition from internal to public interfaces. We aim to help API producers to deliver a better product and API clients to benefit sooner from public interfaces. Our empirical investigation on five widely adopted Java systems present the following observations. First, we identified 195 promotions from 2,722 internal interfaces. Second, we found that promoted internal interfaces have more clients. Third, we predicted internal interface promotion with precision between 50%-80%, recall 26%-82%, and AUC 74%-85%. Finally, by applying our predictor on the last version of the analyzed systems, we automatically detected 382 public interface candidates.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79648972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software development is typically a collaborative activity. Development in large projects often involves branches, where changes are performed in parallel and merged periodically. While, there is no consensus on who should perform the merge, team members typically try to find someone with enough knowledge about the changes in the branches. This task can be difficult in cases where many different developers have made significant changes. My research proposes an approach, TIPMerge, to help select the most appropriate developers to participate in a collaborative merge session, such that we maximize the knowledge spread across changes. The goal of this research is to select a specified number of developers with the highest joint coverage. We use an optimization algorithm to find which developers form the best team together to deal with a specific merge case. We have implemented all the steps of our approach and evaluate parts of them.
{"title":"Identifying participants for collaborative merge","authors":"Catarina Costa","doi":"10.1145/2950290.2983963","DOIUrl":"https://doi.org/10.1145/2950290.2983963","url":null,"abstract":"Software development is typically a collaborative activity. Development in large projects often involves branches, where changes are performed in parallel and merged periodically. While, there is no consensus on who should perform the merge, team members typically try to find someone with enough knowledge about the changes in the branches. This task can be difficult in cases where many different developers have made significant changes. My research proposes an approach, TIPMerge, to help select the most appropriate developers to participate in a collaborative merge session, such that we maximize the knowledge spread across changes. The goal of this research is to select a specified number of developers with the highest joint coverage. We use an optimization algorithm to find which developers form the best team together to deal with a specific merge case. We have implemented all the steps of our approach and evaluate parts of them.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82226546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In a test suite, all the tests should be independent: no test should affect another test's result, and running the tests in any order should yield the same test results. The assumption of such test independence is important so that tests behave consistently as designed. However, this critical assumption often does not hold in practice due to test dependence. Test dependence causes two serious problems: a dependent test may spuriously fail even when the software is correct (a false positive alarm), or it may spuriously pass even when a bug exists in the software (a false negative). Existing approaches to cope with test dependence require tests to be executed in a given order or for each test to be executed in a separate virtual machine. This paper presents an approach that can automatically repair test dependence so that each test in a suite yields the same result regardless of their execution order. At compile time, the approach refactors code under test and test code to eliminate test dependence and prevent spurious test successes or failures. We develop a prototype of our approach to handle one of the most common causes of test dependence and evaluate the prototype on five subject programs. In our experimental evaluation, our prototype is capable of eliminating up to 12.5% of the test dependence in the subject programs.
{"title":"Repairing test dependence","authors":"Wing Lam","doi":"10.1145/2950290.2983969","DOIUrl":"https://doi.org/10.1145/2950290.2983969","url":null,"abstract":"In a test suite, all the tests should be independent: no test should affect another test's result, and running the tests in any order should yield the same test results. The assumption of such test independence is important so that tests behave consistently as designed. However, this critical assumption often does not hold in practice due to test dependence. Test dependence causes two serious problems: a dependent test may spuriously fail even when the software is correct (a false positive alarm), or it may spuriously pass even when a bug exists in the software (a false negative). Existing approaches to cope with test dependence require tests to be executed in a given order or for each test to be executed in a separate virtual machine. This paper presents an approach that can automatically repair test dependence so that each test in a suite yields the same result regardless of their execution order. At compile time, the approach refactors code under test and test code to eliminate test dependence and prevent spurious test successes or failures. We develop a prototype of our approach to handle one of the most common causes of test dependence and evaluate the prototype on five subject programs. In our experimental evaluation, our prototype is capable of eliminating up to 12.5% of the test dependence in the subject programs.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83725250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Web design is a visual art, but web designs are code. Designers work visually and must then manually translate their designs to code. We propose using synthesis to automatically translate the storyboards designers already produce into CSS stylesheets and JavaScript code. To build a synthesis tool for this complex domain, we will use a novel composition mechanism that allows splitting the synthesis task among domain-specific solvers. We have built a domain-specific solver for CSS stylesheets; solvers for DOM actions and JavaScript code can be built with similar techniques. To compose the three domain-specific solvers, we propose using partial counterexamples to exchange information between different domains. Early results suggest that this composition mechanism is fast and allows specializing each solver to its domain.
{"title":"Generating interactive web pages from storyboards","authors":"P. Panchekha","doi":"10.1145/2950290.2983948","DOIUrl":"https://doi.org/10.1145/2950290.2983948","url":null,"abstract":"Web design is a visual art, but web designs are code. Designers work visually and must then manually translate their designs to code. We propose using synthesis to automatically translate the storyboards designers already produce into CSS stylesheets and JavaScript code. To build a synthesis tool for this complex domain, we will use a novel composition mechanism that allows splitting the synthesis task among domain-specific solvers. We have built a domain-specific solver for CSS stylesheets; solvers for DOM actions and JavaScript code can be built with similar techniques. To compose the three domain-specific solvers, we propose using partial counterexamples to exchange information between different domains. Early results suggest that this composition mechanism is fast and allows specializing each solver to its domain.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75233409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuepeng Wang, Yu Feng, R. Martins, Arati Kaushik, Işıl Dillig, S. Reiss
In many common scenarios, programmers need to implement functionality that is already provided by some third party library. This paper presents a tool called Hunter that facilitates code reuse by finding relevant methods in large code bases and automatically synthesizing any necessary wrapper code. Since Hunter internally uses advanced program synthesis technology, it can automatically reuse existing methods even when code adaptation is necessary. We have implemented Hunter as an Eclipse plug-in and evaluate it by (a) comparing it against S6, a state-of-the-art code reuse tool, and (b) performing a user study. Our evaluation shows that Hunter compares favorably with S6 and increases programmer productivity.
{"title":"Hunter: next-generation code reuse for Java","authors":"Yuepeng Wang, Yu Feng, R. Martins, Arati Kaushik, Işıl Dillig, S. Reiss","doi":"10.1145/2950290.2983934","DOIUrl":"https://doi.org/10.1145/2950290.2983934","url":null,"abstract":"In many common scenarios, programmers need to implement functionality that is already provided by some third party library. This paper presents a tool called Hunter that facilitates code reuse by finding relevant methods in large code bases and automatically synthesizing any necessary wrapper code. Since Hunter internally uses advanced program synthesis technology, it can automatically reuse existing methods even when code adaptation is necessary. We have implemented Hunter as an Eclipse plug-in and evaluate it by (a) comparing it against S6, a state-of-the-art code reuse tool, and (b) performing a user study. Our evaluation shows that Hunter compares favorably with S6 and increases programmer productivity.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75239513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Despite the advanced static analysis tools available within modern integrated development environments (IDEs), the error messages these tools produce remain perplexing for developers to comprehend. This research postulates that tools can computationally expose their internal reasoning processes to generate assistive error explanations that more closely align with how developers explain errors to themselves.
{"title":"How should static analysis tools explain anomalies to developers?","authors":"Titus Barik","doi":"10.1145/2950290.2983968","DOIUrl":"https://doi.org/10.1145/2950290.2983968","url":null,"abstract":"Despite the advanced static analysis tools available within modern integrated development environments (IDEs), the error messages these tools produce remain perplexing for developers to comprehend. This research postulates that tools can computationally expose their internal reasoning processes to generate assistive error explanations that more closely align with how developers explain errors to themselves.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74416814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The large body of existing research in Test Case Prioritization (TCP) techniques, can be broadly classified into two categories: dynamic techniques (that rely on run-time execution information) and static techniques (that operate directly on source and test code). Absent from this current body of work is a comprehensive study aimed at understanding and evaluating the static approaches and comparing them to dynamic approaches on a large set of projects. In this work, we perform the first extensive study aimed at empirically evaluating four static TCP techniques comparing them with state-of-research dynamic TCP techniques at different test-case granularities (e.g., method and class-level) in terms of effectiveness, efficiency and similarity of faults detected. This study was performed on 30 real-word Java programs encompassing 431 KLoC. In terms of effectiveness, we find that the static call-graph-based technique outperforms the other static techniques at test-class level, but the topic-model-based technique performs better at test-method level. In terms of efficiency, the static call-graph-based technique is also the most efficient when compared to other static techniques. When examining the similarity of faults detected for the four static techniques compared to the four dynamic ones, we find that on average, the faults uncovered by these two groups of techniques are quite dissimilar, with the top 10% of test cases agreeing on only 25% - 30% of detected faults. This prompts further research into the severity/importance of faults uncovered by these techniques, and into the potential for combining static and dynamic information for more effective approaches.
{"title":"A large-scale empirical comparison of static and dynamic test case prioritization techniques","authors":"Qi Luo, Kevin Moran, D. Poshyvanyk","doi":"10.1145/2950290.2950344","DOIUrl":"https://doi.org/10.1145/2950290.2950344","url":null,"abstract":"The large body of existing research in Test Case Prioritization (TCP) techniques, can be broadly classified into two categories: dynamic techniques (that rely on run-time execution information) and static techniques (that operate directly on source and test code). Absent from this current body of work is a comprehensive study aimed at understanding and evaluating the static approaches and comparing them to dynamic approaches on a large set of projects. In this work, we perform the first extensive study aimed at empirically evaluating four static TCP techniques comparing them with state-of-research dynamic TCP techniques at different test-case granularities (e.g., method and class-level) in terms of effectiveness, efficiency and similarity of faults detected. This study was performed on 30 real-word Java programs encompassing 431 KLoC. In terms of effectiveness, we find that the static call-graph-based technique outperforms the other static techniques at test-class level, but the topic-model-based technique performs better at test-method level. In terms of efficiency, the static call-graph-based technique is also the most efficient when compared to other static techniques. When examining the similarity of faults detected for the four static techniques compared to the four dynamic ones, we find that on average, the faults uncovered by these two groups of techniques are quite dissimilar, with the top 10% of test cases agreeing on only 25% - 30% of detected faults. This prompts further research into the severity/importance of faults uncovered by these techniques, and into the potential for combining static and dynamic information for more effective approaches.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77460949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GUI input generation tools for Android apps, such as Android's Monkey, are useful for automatically producing test inputs, but these tests are generally orders of magnitude larger than necessary, making them difficult for humans to understand. We present a technique for minimizing the output of such tools. Our technique accounts for the non-deterministic behavior of mobile apps, producing small event traces that reach a desired activity with high probability. We propose a variant of delta debugging, augmented to handle non-determinism, to solve the problem of trace minimization. We evaluate our algorithm on two sets of commercial and open-source Android applications, showing that we can minimize large event traces reaching a particular application activity, producing traces that are, on average, less than 2% the size of the original traces.
{"title":"Minimizing GUI event traces","authors":"Lazaro Clapp, O. Bastani, Saswat Anand, A. Aiken","doi":"10.1145/2950290.2950342","DOIUrl":"https://doi.org/10.1145/2950290.2950342","url":null,"abstract":"GUI input generation tools for Android apps, such as Android's Monkey, are useful for automatically producing test inputs, but these tests are generally orders of magnitude larger than necessary, making them difficult for humans to understand. We present a technique for minimizing the output of such tools. Our technique accounts for the non-deterministic behavior of mobile apps, producing small event traces that reach a desired activity with high probability. We propose a variant of delta debugging, augmented to handle non-determinism, to solve the problem of trace minimization. We evaluate our algorithm on two sets of commercial and open-source Android applications, showing that we can minimize large event traces reaching a particular application activity, producing traces that are, on average, less than 2% the size of the original traces.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82501919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Functional testing is widespread and supported by a multitude of tools, including tools to mine functional specifications. In contrast, non-functional attributes like performance are often less well understood and tested. While many profiling tools are available to gather raw performance data, interpreting this raw data requires expert knowledge and a thorough understanding of the underlying software and hardware infrastructure. In this work we present an approach that mines performance specifications from running systems autonomously. The tool creates performance models during runtime. The mined models are analyzed further to create compact and comprehensive performance assertions. The resulting assertions can be used as an evidence-based performance specification for performance regression testing, performance monitoring, or as a foundation for more formal performance specifications.
{"title":"Mining performance specifications","authors":"Marc Brünink, David S. Rosenblum","doi":"10.1145/2950290.2950314","DOIUrl":"https://doi.org/10.1145/2950290.2950314","url":null,"abstract":"Functional testing is widespread and supported by a multitude of tools, including tools to mine functional specifications. In contrast, non-functional attributes like performance are often less well understood and tested. While many profiling tools are available to gather raw performance data, interpreting this raw data requires expert knowledge and a thorough understanding of the underlying software and hardware infrastructure. In this work we present an approach that mines performance specifications from running systems autonomously. The tool creates performance models during runtime. The mined models are analyzed further to create compact and comprehensive performance assertions. The resulting assertions can be used as an evidence-based performance specification for performance regression testing, performance monitoring, or as a foundation for more formal performance specifications.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82531001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chungha Sung, Markus Kusano, Nishant Sinha, Chao Wang
The number and complexity of JavaScript-based web applications are rapidly increasing, but methods and tools for automatically testing them are lagging behind, primarily due to the difficulty in analyzing the subtle interactions between the applications and the event-driven execution environment. Although static analysis techniques have been routinely used on software written in traditional programming languages, such as Java and C++, adapting them to handle JavaScript code and the HTML DOM is difficult. In this work, we propose the first constraint-based declarative program analysis procedure for computing dependencies over program variables as well as event-handler functions of the various DOM elements, which is crucial for analyzing the behavior of a client-side web application. We implemented the method in a software tool named JSDEP and evaluated it in ARTEMIS, a platform for automated web application testing. Our experiments on a large set of web applications show the new method can significantly reduce the number of redundant test sequences and significantly increase test coverage with minimal overhead.
{"title":"Static DOM event dependency analysis for testing web applications","authors":"Chungha Sung, Markus Kusano, Nishant Sinha, Chao Wang","doi":"10.1145/2950290.2950292","DOIUrl":"https://doi.org/10.1145/2950290.2950292","url":null,"abstract":"The number and complexity of JavaScript-based web applications are rapidly increasing, but methods and tools for automatically testing them are lagging behind, primarily due to the difficulty in analyzing the subtle interactions between the applications and the event-driven execution environment. Although static analysis techniques have been routinely used on software written in traditional programming languages, such as Java and C++, adapting them to handle JavaScript code and the HTML DOM is difficult. In this work, we propose the first constraint-based declarative program analysis procedure for computing dependencies over program variables as well as event-handler functions of the various DOM elements, which is crucial for analyzing the behavior of a client-side web application. We implemented the method in a software tool named JSDEP and evaluated it in ARTEMIS, a platform for automated web application testing. Our experiments on a large set of web applications show the new method can significantly reduce the number of redundant test sequences and significantly increase test coverage with minimal overhead.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86645969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}