Do Automatically Generated Unit Tests Find Real Faults? An Empirical Study of Effectiveness and Challenges (T)
S. Shamshiri, René Just, J. Rojas, G. Fraser, Phil McMinn, Andrea Arcuri
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 201-211. DOI: 10.1109/ASE.2015.86
Rather than tediously writing unit tests manually, tools can be used to generate them automatically - sometimes even resulting in higher code coverage than manual testing. But how good are these tests at actually finding faults? To answer this question, we applied three state-of-the-art unit test generation tools for Java (Randoop, EvoSuite, and Agitar) to the 357 real faults in the Defects4J dataset and investigated how well the generated test suites perform at detecting these faults. Although the automatically generated test suites detected 55.7% of the faults overall, only 19.9% of all the individual test suites detected a fault. By studying the effectiveness and problems of the individual tools and the tests they generate, we derive insights to support the development of automated unit test generators that achieve a higher fault detection rate. These insights include 1) improving the obtained code coverage so that faulty statements are executed in the first instance, 2) improving the propagation of faulty program states to an observable output, coupled with the generation of more sensitive assertions, and 3) improving the simulation of the execution environment to detect faults that are dependent on external factors such as date and time.
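To make insight (2) concrete, here is a minimal hypothetical JUnit 4 sketch; the FixedStack class and every name in it are invented for illustration, not taken from the study. The first test asserts only on the returned value, so a fault that corrupts internal state passes unnoticed; the second adds assertions on the resulting state, which lets the fault propagate to an observable failure.

```java
import static org.junit.Assert.*;
import org.junit.Test;

public class StackTest {
    // Minimal stack, included only to make the example self-contained.
    static class FixedStack {
        private final int[] data = new int[16];
        private int size = 0;
        void push(int x) { data[size++] = x; }
        int pop() { return data[--size]; }
        int size() { return size; }
        boolean isEmpty() { return size == 0; }
    }

    @Test
    public void weakAssertion() {
        FixedStack s = new FixedStack();
        s.push(42);
        // Observes only the return value: a fault that corrupts the
        // internal size counter would still pass this test.
        assertEquals(42, s.pop());
    }

    @Test
    public void sensitiveAssertions() {
        FixedStack s = new FixedStack();
        s.push(42);
        assertEquals(42, s.pop());
        // Also observes the resulting state, so a corrupted size field
        // propagates to a failing verdict.
        assertEquals(0, s.size());
        assertTrue(s.isEmpty());
    }
}
```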
Region and Effect Inference for Safe Parallelism (T)
Alexandros Tzannes, Stephen Heumann, Lamyaa Eloussi, Mohsen Vakilian, Vikram S. Adve, Michael Han
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 512-523. DOI: 10.1109/ASE.2015.59
In this paper, we present the first full regions-and-effects inference algorithm for explicitly parallel fork-join programs. We infer annotations inspired by Deterministic Parallel Java (DPJ) for a type-safe subset of C++. We chose the DPJ annotations because they give the strongest safety guarantees of any existing concurrency-checking approach we know of, static or dynamic, and DPJ is also the most expressive static checking system we know of that gives strong safety guarantees. This expressiveness, however, makes manual annotation difficult and tedious, which motivates the need for automatic inference; it also makes the inference problem very challenging: the code may use region polymorphism, imperative updates with complex aliasing, arbitrary recursion, hierarchical region specifications, and wildcard elements to describe potentially infinite sets of regions. We express the inference as a constraint satisfaction problem and develop, implement, and evaluate an algorithm for solving it. The region and effect annotations inferred by the algorithm constitute a checkable proof of safe parallelism and can be recorded both for documentation and for fast, modular safety checking.
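As a rough illustration of the safety facts such annotations capture, consider a plain-Java fork-join sketch; the region names and effect summaries in the comments are written in the spirit of DPJ and are hypothetical, not output of the authors' inference. Because the two tasks' write effects cover disjoint regions, the parallel composition is provably safe.

```java
import java.util.Arrays;

public class DisjointHalves {
    public static void main(String[] args) throws InterruptedException {
        int[] a = new int[8];
        // inferred effect (hypothetical): writes region Left, i.e. a[0..3]
        Thread left = new Thread(() -> { for (int i = 0; i < 4; i++) a[i] = i * i; });
        // inferred effect (hypothetical): writes region Right, i.e. a[4..7]
        Thread right = new Thread(() -> { for (int i = 4; i < 8; i++) a[i] = i * i; });
        left.start(); right.start();   // fork: disjoint write effects, no interference
        left.join(); right.join();     // join: effects merge back into the parent
        System.out.println(Arrays.toString(a));
    }
}
```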
Interpolation Guided Compositional Verification (T)
Shang-Wei Lin, Jun Sun, Truong Khanh Nguyen, Yang Liu, J. Dong
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 65-74. DOI: 10.1109/ASE.2015.33
Model checking suffers from the state space explosion problem. Compositional verification techniques such as assume-guarantee reasoning (AGR) have been proposed to alleviate the problem. However, there are at least three challenges in applying AGR. Firstly, given a system M1 ∥ M2, how do we automatically construct and refine (in the presence of spurious counterexamples) an assumption A2, which must be an abstraction of M2? Previous approaches suggest incrementally learning and modifying the assumption through multiple invocations of a model checker, which can often be time-consuming. Secondly, how do we keep the state space small when checking M1 ∥ A2 ⊨ φ if multiple refinements of A2 are necessary? Lastly, in the presence of multiple parallel components, how do we partition the components? In this work, we propose interpolation-guided compositional verification. The idea is to tackle all three challenges with interpolants: to generate and refine the abstraction of M2, to abstract M1 at the same time (so that the state space is reduced even if A2 is refined all the way to M2), and to find good partitions. Experimental results show that the proposed approach consistently outperforms existing approaches.
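For reference, the classic non-circular assume-guarantee rule that these challenges revolve around: if the assumption A2 abstracts M2, and M1 composed with A2 satisfies the property, then the full system satisfies it too. A spurious counterexample on the first premise means A2 is too coarse and must be refined, which is where the interpolation-based generation and refinement enter.

\[
\frac{M_1 \parallel A_2 \models \varphi \qquad M_2 \preceq A_2}{M_1 \parallel M_2 \models \varphi}
\]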
Predicting Delays in Software Projects Using Networked Classification (T)
Morakot Choetkiertikul, K. Dam, T. Tran, A. Ghose
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 353-364. DOI: 10.1109/ASE.2015.55
Software projects have a high risk of cost and schedule overruns, which has long been a source of concern for the software engineering community. One of the challenges in software project management is to make reliable predictions of delays amid the constant and rapid changes inherent in software projects. This paper presents a novel approach to providing automated support for project managers and other decision makers in predicting whether a subset of software tasks (among the hundreds to thousands of ongoing tasks) in a software project is at risk of being delayed. Our approach makes use not only of features specific to individual software tasks (i.e., local data), as done in previous work, but also of their relationships (i.e., networked data). In addition, using collective classification, our approach can simultaneously predict the degree of delay for a group of related tasks. Our evaluation results show a significant improvement over traditional approaches that classify each task independently: 46-97% precision (a 49% improvement), 46-97% recall (a 28% improvement), 56-75% F-measure (a 39% improvement), and 78-95% Area Under the ROC Curve (a 16% improvement).
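A schematic sketch of the collective-classification loop; the Task type, the 0.7/0.3 weighting, and the 0.5 threshold are invented for illustration, standing in for the learned models in the paper. Each task is bootstrapped from its local features, then repeatedly re-scored using its neighbours' current predictions until the labels stabilise.

```java
import java.util.ArrayList;
import java.util.List;

public class CollectiveClassifierSketch {
    static final class Task {
        double localRisk;                       // stand-in for local (task-specific) features
        List<Task> related = new ArrayList<>(); // networked data: links to related tasks
        boolean delayed;                        // current prediction
    }

    static void classify(List<Task> tasks, int maxIters) {
        // Bootstrap from local data only, as a non-collective classifier would.
        for (Task t : tasks) t.delayed = t.localRisk > 0.5;
        for (int it = 0; it < maxIters; it++) {
            boolean changed = false;
            for (Task t : tasks) {
                double neighbourRate = t.related.isEmpty() ? 0.0
                        : (double) t.related.stream().filter(r -> r.delayed).count()
                          / t.related.size();
                // Toy scoring rule standing in for a learned relational model.
                boolean next = 0.7 * t.localRisk + 0.3 * neighbourRate > 0.5;
                changed |= next != t.delayed;
                t.delayed = next;
            }
            if (!changed) break;                // fixpoint: predictions are mutually consistent
        }
    }
}
```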
An Automated Framework for Recommending Program Elements to Novices (N)
Kurtis Zimmerman, C. R. Rupakheti
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 283-288. DOI: 10.1109/ASE.2015.54
Novice programmers often learn programming by implementing well-known algorithms, a process that presents several challenges. Recommendation systems in software currently focus on programmer productivity and ease of development; teaching aids for novice programmers based on recommendation systems remain an under-explored area. In this paper, we present a general framework for recognizing the desired target of partially-written code and recommending a reliable series of edits to transform the input program into the target solution. Our code analysis is based on graph matching and tree edit algorithms. Our experimental results show that efficient graph comparison techniques can accurately match two portions of source code and produce an accurate set of source code edits. We provide details on the implementation of our framework, which is developed as a plugin for Java in the Eclipse IDE.
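A toy version of the recommendation step; the Node tree shape and the emitted edit strings are invented here, and the paper's analysis uses full graph matching and tree edit distance rather than this lockstep walk. The sketch compares the student's tree against the target solution's tree and suggests an edit wherever they diverge.

```java
import java.util.ArrayList;
import java.util.List;

public class TreeDiffSketch {
    record Node(String label, List<Node> kids) {
        Node(String label, Node... kids) { this(label, List.of(kids)); }
    }

    // Walk both trees in lockstep and collect edit suggestions.
    static void diff(Node got, Node want, List<String> edits) {
        if (!got.label().equals(want.label()))
            edits.add("replace '" + got.label() + "' with '" + want.label() + "'");
        int shared = Math.min(got.kids().size(), want.kids().size());
        for (int i = 0; i < shared; i++) diff(got.kids().get(i), want.kids().get(i), edits);
        for (int i = shared; i < want.kids().size(); i++)
            edits.add("insert '" + want.kids().get(i).label() + "' under '" + want.label() + "'");
        for (int i = shared; i < got.kids().size(); i++)
            edits.add("delete '" + got.kids().get(i).label() + "'");
    }

    public static void main(String[] args) {
        Node student = new Node("for", new Node("i < n"), new Node("i++"));
        Node target  = new Node("for", new Node("i <= n"), new Node("i++"));
        List<String> edits = new ArrayList<>();
        diff(student, target, edits);
        edits.forEach(System.out::println);     // replace 'i < n' with 'i <= n'
    }
}
```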
Synthesising Interprocedural Bit-Precise Termination Proofs (T)
Hong-Yi Chen, C. David, D. Kroening, P. Schrammel, Björn Wachter
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 53-64. DOI: 10.1109/ASE.2015.10
Proving program termination is key to guaranteeing the absence of undesirable behaviour, such as hanging programs and even security vulnerabilities such as denial-of-service attacks. To make termination checks scale to large systems, interprocedural termination analysis seems essential; yet it is a largely unexplored area of research, as most effort in termination analysis has focussed on difficult single-procedure problems. We present a modular termination analysis for C programs using template-based interprocedural summarisation. Our analysis combines a context-sensitive, over-approximating forward analysis with the inference of under-approximating preconditions for termination. Bit-precise termination arguments are synthesised over lexicographic linear ranking function templates. Our experimental results show that our tool 2LS outperforms state-of-the-art alternatives, and demonstrate the clear advantage of interprocedural reasoning over monolithic analysis in terms of efficiency, while retaining comparable precision.
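The templates have the standard lexicographic linear shape: a tuple of affine components over the program variables x,

\[
\ell(x) = (\ell_1(x), \ldots, \ell_k(x)), \qquad \ell_i(x) = c_i^{\mathsf{T}} x + d_i,
\]

which proves termination if every transition from x to x' strictly decreases some component that is bounded from below, while leaving all earlier components non-increasing:

\[
\exists j.\; \ell_j(x') < \ell_j(x) \;\wedge\; \ell_j(x) \ge 0 \;\wedge\; \forall i < j.\; \ell_i(x') \le \ell_i(x).
\]

The paper's twist is to synthesise the coefficients bit-precisely, i.e. over machine arithmetic rather than mathematical integers.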
Efficient Data Model Verification with Many-Sorted Logic (T)
Ivan Bocic, T. Bultan
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 42-52. DOI: 10.1109/ASE.2015.48
Misuse or loss of web application data can have catastrophic consequences in today's Internet-oriented world. Hence, verification of web application data models is of paramount importance. We have developed a framework for verification of web application data models via translation to First Order Logic (FOL), followed by automated theorem proving. Due to the undecidability of FOL, this automated approach does not always produce a conclusive answer. In this paper, we investigate the use of many-sorted logic in data model verification in order to improve the effectiveness of this approach. Many-sorted logic allows us to specify type information explicitly, thus lightening the burden of reasoning about type information during theorem proving. Our experiments demonstrate that using many-sorted logic improves verification performance significantly and completely eliminates inconclusive results across 7 real-world web applications, down from a 17% inconclusive rate.
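To see the difference, take an illustrative data-model invariant that is not from the paper: "every post has an owner". In unsorted FOL, type membership must be encoded as guard predicates that the prover reasons about explicitly,

\[
\forall p.\; \mathit{Post}(p) \rightarrow \exists u.\; \mathit{User}(u) \wedge \mathit{owns}(u, p),
\]

whereas in many-sorted logic the same information lives in the quantifier sorts and never burdens the proof search:

\[
\forall p \colon \mathsf{Post}.\; \exists u \colon \mathsf{User}.\; \mathit{owns}(u, p).
\]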
CLAMI: Defect Prediction on Unlabeled Datasets (T)
Jaechang Nam, Sunghun Kim
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 452-463. DOI: 10.1109/ASE.2015.56
Defect prediction on new projects, or projects with limited historical data, is an interesting problem in software engineering, largely because it is difficult to collect the defect information needed to label a dataset for training a prediction model. Cross-project defect prediction (CPDP) tries to address this problem by reusing prediction models built from other projects that have enough historical data. However, CPDP does not always yield a strong prediction model because of distribution differences among datasets. Approaches for defect prediction on unlabeled datasets have also tried to address the problem by adopting unsupervised learning, but they have one major limitation: the need for manual effort. In this study, we propose novel approaches, CLA and CLAMI, that show the potential for defect prediction on unlabeled datasets in an automated manner, without manual effort. The key idea of CLA and CLAMI is to label an unlabeled dataset using the magnitude of metric values. In our empirical study on seven open-source projects, CLAMI achieved promising prediction performance, with an average f-measure of 0.636 and an average AUC of 0.723, comparable to that of defect prediction based on supervised learning.
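A condensed sketch of CLA-style labeling under simplifying assumptions: the actual CLA approach clusters instances by their count of higher-than-median metric values and labels the top clusters, and CLAMI further selects metrics and instances, while this sketch collapses all of that into a single median cut.

```java
import java.util.Arrays;

public class ClaSketch {
    // metrics[i][j] = value of metric j for instance i; returns a buggy/clean label per instance.
    static boolean[] label(double[][] metrics) {
        int n = metrics.length, m = metrics[0].length;
        double[] medians = new double[m];
        for (int j = 0; j < m; j++) {           // per-metric (upper) median
            double[] col = new double[n];
            for (int i = 0; i < n; i++) col[i] = metrics[i][j];
            Arrays.sort(col);
            medians[j] = col[n / 2];
        }
        int[] k = new int[n];                   // K = count of higher-than-median metric values
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                if (metrics[i][j] > medians[j]) k[i]++;
        int[] sorted = k.clone();
        Arrays.sort(sorted);
        int cut = sorted[n / 2];                // single cut standing in for CLA's clustering
        boolean[] buggy = new boolean[n];
        for (int i = 0; i < n; i++) buggy[i] = k[i] > cut;
        return buggy;
    }
}
```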
Scaling Size and Parameter Spaces in Variability-Aware Software Performance Models (T)
M. Kowal, Max Tschaikowski, M. Tribastone, Ina Schaefer
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 407-417. DOI: 10.1109/ASE.2015.16
In software performance engineering, what-if scenarios, architecture optimization, capacity planning, run-time adaptation, and uncertainty management of realistic models typically require the evaluation of many model instances. Effective analysis is, however, hindered by two orthogonal sources of complexity. The first is the infamous problem of state space explosion: the analysis of a single model becomes intractable with its size. The second stems from the massive parameter spaces to be explored, across which computations cannot be reused between model instances. In this paper, we efficiently analyze many queuing models that more accurately capture variability and uncertainty of execution rates by incorporating general (i.e., non-exponential) distributions. Applying product-line engineering methods, we consider a family of models generated from a core that evolves into concrete instances through simple delta operations affecting both the topology and the model's parameters. State explosion is tackled by turning to a scalable approximation based on ordinary differential equations. The entire model space is analyzed in a family-based fashion, i.e., at once, using an efficient symbolic solution of a super-model that subsumes every concrete instance. Extensive numerical tests show that this is orders of magnitude faster than a naive instance-by-instance analysis.
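For intuition, the simplest instance of the ODE idea: a single station with arrival rate λ and s parallel servers each serving at rate μ (the exponential textbook case only; the paper's distinctive point is that it also handles general distributions, and whole delta-generated families of such models at once). Instead of one Markov-chain state per queue population, the mean queue length x(t) follows a single deterministic equation:

\[
\frac{dx}{dt} = \lambda - \mu \min(x(t), s).
\]

The analysis cost thus grows with the number of stations and parameters rather than with the size of the discrete state space.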
Automatically Generating Test Templates from Test Names (N)
Benwen Zhang, Emily Hill, J. Clause
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 506-511. DOI: 10.1109/ASE.2015.68
Existing specification-based testing techniques require specifications that either do not exist or are too difficult to create. As a result, they often fall short of their goal of helping developers test expected behaviors. In this paper we present a novel, natural-language-based approach that exploits the descriptive nature of test names to generate test templates. Similar to how modern IDEs simplify development by providing templates for common constructs such as loops, test templates can save time and lower the cognitive barrier for writing tests. The results of our evaluation show that the approach is feasible: despite the difficulty of the task, when test names contain a sufficient amount of information, the approach's accuracy is over 80% at parsing the relevant information from the test name and generating the template.
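A toy illustration of the idea; the camel-case splitting heuristic and the emitted skeleton are invented here, whereas the paper applies proper natural-language analysis to the name. The test name is decomposed into words that seed a template for the developer to complete.

```java
public class TemplateSketch {
    static String templateFor(String testName) {
        // "testPopOnEmptyStackThrowsException" -> [test, Pop, On, Empty, Stack, Throws, Exception]
        String[] words = testName.split("(?=[A-Z])");
        String action = words.length > 1 ? words[1].toLowerCase() : "act";
        StringBuilder b = new StringBuilder();
        b.append("@Test\npublic void ").append(testName).append("() {\n");
        b.append("    // TODO: arrange\n");
        b.append("    // act: call ").append(action).append("(...)\n");
        b.append(testName.contains("Throws")
                ? "    // assert: expect the exception\n"
                : "    // assert: check the result\n");
        b.append("}\n");
        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println(templateFor("testPopOnEmptyStackThrowsException"));
    }
}
```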