MUBot: Learning to Test Large-Scale Commercial Android Apps like a Human
Chao Peng, Zhao Zhang, Zhengwei Lv, Ping Yang
2022 IEEE International Conference on Software Maintenance and Evolution (ICSME), published 2022-10-01
DOI: 10.1109/ICSME55016.2022.00074 (https://doi.org/10.1109/ICSME55016.2022.00074)
Citations: 5
Abstract
Automated GUI testing plays a key role in uncovering crashes and ensuring the stability and robustness of Android apps. Recent research has proposed random, search-based and model-based testing techniques for GUI event generation. In industrial practice, companies have developed GUI exploration tools such as Facebook Sapienz, WeChat WeTest and ByteDance Fastbot to test their products. However, these tools are bound to their predefined GUI exploration strategies and lack the ability to generate human-like actions that exercise meaningful scenarios. To address these challenges, Humanoid was proposed as the first Android testing tool that utilises deep learning to imitate human behaviours, and it achieves promising results compared with current model-based methods. However, when applying Humanoid to our sophisticated commercial apps, we encountered problems such as infinite loops and low test coverage. To this end, we performed the first case study on the performance of deep learning techniques on commercial apps, in order to understand the underlying reasons for the current weaknesses of this promising method. Based on our findings, we propose MUBot (Multi-modal User Bot) for human-like Android testing. Our empirical evaluation shows that MUBot outperforms both Humanoid and Fastbot, our in-house testing tool, in coverage achieved and bug-fixing rate on commercial apps.