Brian Randell described software engineering as “the multi-person development of multi-version programs”. David Parnas expressed that this “pithy phrase implies everything that differentiates software engineering from other programming” (Parnas, 2011). How does current software engineering research compare against this definition? Is there too much focus currently on research into problems and techniques more associated with programming than software engineering? Are there opportunities to use Randell's description of software engineering to guide the community to new research directions? In this extended abstract, I motivate the keynote, which explores these questions and discusses how a consideration of the development streams used by multiple individuals to produce multiple versions of software opens up new avenues for impactful software engineering research.
{"title":"Is Software Engineering Research Addressing Software Engineering Problems? (Keynote)","authors":"G. Murphy","doi":"10.1145/3324884.3417103","DOIUrl":"https://doi.org/10.1145/3324884.3417103","url":null,"abstract":"Brian Randell described software engineering as “the multi-person development of multi-version programs”. David Parnas expressed that this “pithy phrase implies everything that differentiates software engineering from other programming” (Parnas, 2011). How does current software engineering research compare against this definition? Is there too much focus currently on research into problems and techniques more associated with programming than software engineering? Are there opportunities to use Randell's description of software engineering to guide the community to new research directions? In this extended abstract, I motivate the keynote, which explores these questions and discusses how a consideration of the development streams used by multiple individuals to produce multiple versions of software opens up new avenues for impactful software engineering research.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123765042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
One promising recent direction for reducing the cost of mutation analysis is to identify redundant mutations. We propose a technique to discover redundant mutations by proving subsumption relations among method-level mutation operators using weak mutation testing. We conceive and encode a theory of subsumption relations in Z3 for 40 mutation targets (mutations of an expression or statement). We then prove a number of subsumption relations using the Z3 theorem prover and use them to reduce the number of mutations for several mutation targets. MuJava-M incorporates some of these subsumption relations into MuJava. We apply MuJava and MuJava-M to 187 classes from 17 projects. Our approach correctly discards mutations in 74.97% of the cases and reduces the number of mutations by 72.52%.
{"title":"Identifying Mutation Subsumption Relations","authors":"Beatriz Souza","doi":"10.1145/3324884.3418921","DOIUrl":"https://doi.org/10.1145/3324884.3418921","url":null,"abstract":"One recent promising direction in reducing costs of mutation analysis is to identify redundant mutations. We propose a technique to discover redundant mutations by proving subsumption relations among method-level mutation operators using weak mutation testing. We conceive and encode a theory of subsumption relations in Z3 for 40 mutation targets (mutations of an expression or statement). Then we prove a number of subsumption relations using the Z3 theorem prover, and reduce the number of mutations in a number of mutation targets. MUJAvA-M includes some subsumption relations in Mujava. We apply Mujava and Mujava-m to 187 classes of 17 projects. Our approach correctly discards mutations in 74.97% of the cases, and reduces the number of mutations by 72.52%.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124925400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Whenever software components process personal or private data, appropriate data protection mechanisms are mandatory. An essential factor in achieving trust and transparency is not to give preference to a single party but to make it possible to audit the data usage in an unbiased way. The scenario considered in this contribution involves (i) users bringing in sensitive data they want kept safe, (ii) developers building software-based services whose Intellectual Properties (IPs) they desire to protect, and (iii) platform providers wanting to be trusted and to be able to rely on the developers' integrity. The authors see these interests as an insufficiently resolved field of tension that can be relaxed by representing software components with a suitable level of transparency, giving insights without exposing every detail.
{"title":"Towards Transparency-Encouraging Partial Software Disclosure to Enable Trust in Data Usage","authors":"Christopher M. Schindler","doi":"10.1145/3324884.3415282","DOIUrl":"https://doi.org/10.1145/3324884.3415282","url":null,"abstract":"Whenever software components process personal or private data, appropriate data protection mechanisms are mandatory. An essential factor in achieving trust and transparency is not to give preference to a single party but to make it possible to audit the data usage in an unbiased way. The scenario in mind for this contribution contains (i) users bringing in sensitive data they want to be safe, (ii) developers building software-based services whose Intellectual Properties (IPs) they desire to protect, and (iii) platform providers wanting to be trusted and to be able to rely on the developers integrity. The authors see these interests as an insufficiently solved field of tension that can be relaxed by a suitable level of transparently represented software components to give insights without exposing every detail.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126216568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As Deep Learning (DL) is continuously adopted in many industrial applications, its quality and reliability start to raise concerns. Similar to the traditional software development process, testing DL software to uncover its defects at an early stage is an effective way to reduce risks after deployment. According to the fundamental assumption of deep learning, DL software does not provide statistical guarantees and has limited capability in handling data that falls outside of its learned distribution, i.e., out-of-distribution (OOD) data. Although recent progress has been made in designing novel testing techniques for DL software, which can detect thousands of errors, the current state-of-the-art DL testing techniques usually do not take the distribution of generated test data into consideration. It is therefore hard to judge whether the "identified errors" are indeed meaningful errors to the DL application (i.e., due to quality issues of the model) or outliers that cannot be handled by the current model (i.e., due to the lack of training data). To fill this gap, we take the first step and conduct a large-scale empirical study, with a total of 451 experiment configurations, 42 deep neural networks (DNNs) and 1.2 million test data instances, to investigate and characterize the impact of OOD awareness on DL testing. We further analyze the consequences when DL systems go into production by evaluating the effectiveness of adversarial retraining with distribution-aware errors. The results confirm that introducing data distribution awareness in both the testing and enhancement phases outperforms distribution-unaware retraining by up to 21.5%.
{"title":"Cats Are Not Fish: Deep Learning Testing Calls for Out-Of-Distribution Awareness","authors":"David Berend, Xiaofei Xie, L. Ma, Lingjun Zhou, Yang Liu, Chi Xu, Jianjun Zhao","doi":"10.1145/3324884.3416609","DOIUrl":"https://doi.org/10.1145/3324884.3416609","url":null,"abstract":"As Deep Learning (DL) is continuously adopted in many industrial applications, its quality and reliability start to raise concerns. Similar to the traditional software development process, testing the DL software to uncover its defects at an early stage is an effective way to reduce risks after deployment. According to the fundamental assumption of deep learning, the DL software does not provide statistical guarantee and has limited capability in handling data that falls outside of its learned distribution, i.e., out-of-distribution (OOD) data. Although recent progress has been made in designing novel testing techniques for DL software, which can detect thousands of errors, the current state-of-the-art DL testing techniques usually do not take the distribution of generated test data into consideration. It is therefore hard to judge whether the “identified errors” are indeed meaningful errors to the DL application (i.e., due to quality issues of the model) or outliers that cannot be handled by the current model (i.e., due to the lack of training data). Tofill this gap, we take thefi rst step and conduct a large scale empirical study, with a total of 451 experiment configurations, 42 deep neural networks (DNNs) and 1.2 million test data instances, to investigate and characterize the impact of OOD-awareness on DL testing. We further analyze the consequences when DL systems go into production by evaluating the effectiveness of adversarial retraining with distribution-aware errors. The results confirm that introducing data distribution awareness in both testing and enhancement phases outperforms distribution unaware retraining by up to 21.5%.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129432233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Client-specific equivalence checking (CSEC) is a technique proposed previously to perform impact analysis of changes to downstream components (libraries) from the perspective of an unchanged system (client). Existing analysis techniques, whether general (regression verification, equivalence checking) or special-purpose, when applied to CSEC, either require users to provide specifications, or do not scale. We propose a novel solution to the CSEC problem, called 2clever, that is based on searching the control-flow of a program for impact boundaries. We evaluate a prototype implementation of 2clever on a comprehensive set of benchmarks and conclude that our prototype performs well compared to the state-of-the-art.
{"title":"Scaling Client-Specific Equivalence Checking via Impact Boundary Search","authors":"Nick Feng, Federico Mora, V. Hui, M. Chechik","doi":"10.1145/3324884.3416634","DOIUrl":"https://doi.org/10.1145/3324884.3416634","url":null,"abstract":"Client-specific equivalence checking (CSEC) is a technique proposed previously to perform impact analysis of changes to downstream components (libraries) from the perspective of an unchanged system (client). Existing analysis techniques, whether general (re-gression verification, equivalence checking) or special-purpose, when applied to CSEC, either require users to provide specifications, or do not scale. We propose a novel solution to the CSEC problem, called 2clever, that is based on searching the control-flow of a program for impact boundaries. We evaluate a prototype implementation of 2clever on a comprehensive set of benchmarks and conclude that our prototype performs well compared to the state-of-the-art.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130951222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Android platform provides a number of sophisticated concurrency mechanisms for the development of apps. These mechanisms, while powerful, are quite difficult for mobile developers to master properly. In fact, prior studies have shown concurrency issues, such as event-race defects, to be prevalent among real-world Android apps. In this paper, we propose a flow-, context-, and thread-sensitive static analysis framework, called ER Catcher, for the detection of event-race defects in Android apps. ER Catcher introduces a new type of summary function aimed at modeling the concurrent behavior of methods in both Android apps and libraries. In addition, it leverages a novel, statically constructed Vector Clock for rapid analysis of happens-before relations. Altogether, these design choices enable ER Catcher not only to detect event-race defects with a substantially higher degree of accuracy, but also to do so in a fraction of the time required by the existing state-of-the-art technique.
{"title":"ER Catcher: A Static Analysis Framework for Accurate and Scalable Event-Race Detection in Android","authors":"Navid Salehnamadi, Abdulaziz Alshayban, Iftekhar Ahmed, S. Malek","doi":"10.1145/3324884.3416639","DOIUrl":"https://doi.org/10.1145/3324884.3416639","url":null,"abstract":"Android platform provisions a number of sophisticated concurrency mechanisms for the development of apps. The concurrency mechanisms, while powerful, are quite difficult to properly master by mobile developers. In fact, prior studies have shown concurrency issues, such as event-race defects, to be prevalent among real-world Android apps. In this paper, we propose a flow-, context-, and thread-sensitive static analysis framework, called ER Catcher, for detection of event-race defects in Android apps. ER Catcher introduces a new type of summary function aimed at modeling the concurrent behavior of methods in both Android apps and libraries. In addition, it leverages a novel, statically constructed Vector Clock for rapid analysis of happens-before relations. Altogether, these design choices enable ER Catcher to not only detect event-race defects with a substantially higher degree of accuracy, but also in a fraction of time compared to the existing state-of-the-art technique.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130996719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present lightweight model-based testing (MBT) of the privacy and authorization concepts of the national portal for electronic health services in Norway, which receives over a million visits per month. We have developed test models for creating and updating privacy levels and authorization categories using finite state machines. Our models emphasize not only positive but also negative behavioral aspects of the system. Using edge and edge-pair coverage as acceptance criteria, we identify and systematically derive abstract tests (high-level user scenarios) from the models. Abstract tests are further refined and transformed into concrete tests with detailed steps and data. Although the derivation of abstract tests and their transformation into concrete ones are manual, the execution of concrete tests and the generation of the test report are automated. In total, we extracted 85 abstract test cases, which resulted in 80 concrete test cases with over 550 iterations. Automated execution of all tests takes about 1 hour, while manual execution of a single test takes about 5 minutes (over a 40-fold speedup overall). MBT helped shift the focus of our intellectual effort toward model design rather than test-case design, making the derivation of test scenarios systematic and straightforward. In addition, applying MBT augmented and extended our traditional quality assurance techniques by facilitating better comprehension of the new privacy and authorization concepts. Graphical models served as a useful aid for newcomers learning these concepts.
{"title":"Lightweight MBT Testing for National e-Health Portal in Norway","authors":"D. Gafurov, Margrete Sunde Grovan, Arne Erik Hurum","doi":"10.1145/3324884.3421843","DOIUrl":"https://doi.org/10.1145/3324884.3421843","url":null,"abstract":"We present lightweight model-based testing (MBT) of privacy and authorization concepts of national portal for electronic health services in Norway (which has over a million of visits per month). We have developed test models for creating and updating privacy levels and authorization categories using finite state machine. Our models emphasize not only positive but also negative behavioral aspects of the system. Using edge and edge-pair coverage as an acceptance criteria we identify and systematically derive abstract tests (high level user scenario) from models. Abstract tests are further refined and transformed into concrete tests with detailed steps and data. Although derivation of abstract tests and their transformation into concrete ones are manual, execution of concrete tests and generation of test report are automated. In total, we extracted 85 abstract test cases which resulted in 80 concrete test cases with over 550 iterations. Automated execution of all tests takes about 1 hour, while manual execution of one takes about 5 minutes (over 40 times speedup). MBT contributed to shift the focus of our intellectual work effort into model design rather than test case design, thus making derivation of test scenarios systematic and straight forward. In addition, applying MBT augmented and extended our traditional quality assurance techniques by facilitating better comprehension of new privacy and authorization concepts. Graphical models served as a useful aid in learning these concepts for newcomers.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117172372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Natural language comments are like bridges between human logic and software semantics. Developers use comments to describe the function, implementation, and properties of code snippets. These connections carry rich information, such as the potential types of a variable and the preconditions of a method, among other things. In this paper, we categorize comments and use natural language processing techniques to extract information from them. Based on the semantics of programming languages, different rules are built for each comment category to systematically propagate comments among code entities. We then use the propagated comments to check code usage and code-comment consistency. Our demo system finds 37 bugs in real-world projects, 30 of which have been confirmed by the developers. Besides bugs in the code, we also find 304 defective comments: 12 of them are misleading and 292 are incorrect. Moreover, among the 41,573 comments we propagate, 87 are for private native methods that previously had neither code nor comments. We also conduct a user study in which we find that propagated comments are as good as human-written comments along three dimensions: consistency, naturalness, and meaningfulness.
{"title":"The Classification and Propagation of Program Comments","authors":"Xiangzhe Xu","doi":"10.1145/3324884.3418913","DOIUrl":"https://doi.org/10.1145/3324884.3418913","url":null,"abstract":"Natural language comments are like bridges between human logic and software semantics. Developers use comments to describe the function, implementation, and property of code snippets. This kind of connections contains rich information, like the potential types of a variable and the pre-condition of a method, among other things. In this paper, we categorize comments and use natural language processing techniques to extract information from them. Based on the semantics of programming languages, different rules are built for each comment category to systematically propagate comments among code entities. Then we use the propagated comments to check the code usage and comments consistency. Our demo system finds 37 bugs in real-world projects, 30 of which have been confirmed by the developers. Except for bugs in the code, we also find 304 pieces of defected comments. The 12 of them are misleading and 292 of them are not correct. Moreover, among the 41573 pieces of comments we propagate, 87 comments are for private native methods which had neither code nor comments. We also conduct a user study where we find that propagated comments are as good as human-written comments in three dimensions of consistency, naturalness, and meaningfulness.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129047194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Advances in machine learning (ML) have stimulated the integration of its capabilities into software systems. However, there is a tangible gap between software engineering and machine learning practices that is delaying the development of intelligent services. Software organisations are devoting effort to adjusting their software engineering processes and practices to facilitate the integration of machine learning models. Machine learning researchers, in turn, are focusing on improving the interpretability of machine learning models to support overall system robustness. Our research focuses on bridging this gap through a methodology that evaluates the robustness of machine-learning-enabled software engineering systems. In particular, this methodology will automate the evaluation of the robustness properties of software systems against dataset shift problems in ML. It will also feature a notification mechanism that facilitates the debugging of ML components.
{"title":"Towards Robust Production Machine Learning Systems: Managing Dataset Shift","authors":"Hala Abdelkader","doi":"10.1145/3324884.3415281","DOIUrl":"https://doi.org/10.1145/3324884.3415281","url":null,"abstract":"The advances in machine learning (ML) have stimulated the integration of their capabilities into software systems. However, there is a tangible gap between software engineering and machine learning practices, that is delaying the progress of intelligent services development. Software organisations are devoting effort to adjust the software engineering processes and practices to facilitate the integration of machine learning models. Machine learning researchers as well are focusing on improving the interpretability of machine learning models to support overall system robustness. Our research focuses on bridging this gap through a methodology that evaluates the robustness of machine learning-enabled software engineering systems. In particular, this methodology will automate the evaluation of the robustness properties of software systems against dataset shift problems in ML. It will also feature a notification mechanism that facilitates the debugging of ML components.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"139 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115899262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mobile apps interact with their environment extensively, and these interactions can complicate testing activities because test cases may need a complete environment to be executed. Interactions with the environment can also introduce test flakiness, for instance when the environment behaves in non-deterministic ways. For these reasons, it is common to create test mocks that can eliminate the need for (part of) the environment to be present during testing. Manual mock creation, however, can be extremely time consuming and error-prone. Moreover, the generated mocks can typically only be used in the context of the specific tests for which they were created. To address these issues, we propose MOKA, a general framework for collecting and generating reusable test mocks in an automated way. MOKA leverages the ability to observe a large number of interactions between an application and its environment and uses an iterative approach to generate two possible, alternative types of mocks with different reusability characteristics: advanced mocks generated through program synthesis (ideally) and basic record-replay-based mocks (as a fallback solution). In this paper, we describe the new ideas behind MOKA, its main characteristics, a preliminary empirical study, and a set of possible applications that could benefit from our framework.
{"title":"A Framework for Automated Test Mocking of Mobile Apps","authors":"M. Fazzini, Alessandra Gorla, A. Orso","doi":"10.1145/3324884.3418927","DOIUrl":"https://doi.org/10.1145/3324884.3418927","url":null,"abstract":"Mobile apps interact with their environment extensively, and these interactions can complicate testing activities because test cases may need a complete environment to be executed. Interactions with the environment can also introduce test flakiness, for instance when the environment behaves in non-deterministic ways. For these reasons, it is common to create test mocks that can eliminate the need for (part of) the environment to be present during testing. Manual mock creation, however, can be extremely time consuming and error-prone. Moreover, the generated mocks can typically only be used in the context of the specific tests for which they were created. To address these issues, we propose MOKA, a general framework for collecting and generating reusable test mocks in an automated way. MOKA leverages the ability to observe a large number of interactions between an application and its environment and uses an iterative approach to generate two possible, alternative types of mocks with different reusability characteristics: advanced mocks generated through program synthesis (ideally) and basic record-replay-based mocks (as a fallback solution). In this paper, we describe the new ideas behind MOKA, its main characteristics, a preliminary empirical study, and a set of possible applications that could benefit from our framework.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"216 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121623730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}