CTRAS:众包测试报告聚合和汇总

2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE) Pub Date : 2019-05-25 DOI:10.1109/ICSE.2019.00096

Rui Hao, Yang Feng, James A. Jones, Yuying Li, Zhenyu Chen

{"title":"CTRAS:众包测试报告聚合和汇总","authors":"Rui Hao, Yang Feng, James A. Jones, Yuying Li, Zhenyu Chen","doi":"10.1109/ICSE.2019.00096","DOIUrl":null,"url":null,"abstract":"Crowdsourced testing has been widely adopted to improve the quality of various software products. Crowdsourced workers typically perform testing tasks and report their experiences through test reports. While the crowdsourced test reports provide feedbacks from real usage scenarios, inspecting such a large number of reports becomes a time-consuming yet inevitable task. To improve the efficiency of this task, existing widely used issue-tracking systems, such as JIRA, Bugzilla, and Mantis, have provided keyword-search-based methods to assist users in identifying duplicate test reports. However, on mobile devices (such as mobile phones), where the crowdsourced test reports often contain insufficient text descriptions but instead rich screenshots, these text-analysis-based methods become less effective because the data has fundamentally changed. In this paper, instead of focusing on only detecting duplicates based on textual descriptions, we present CTRAS: a novel approach to leveraging duplicates to enrich the content of bug descriptions and improve the efficiency of inspecting these reports. CTRAS is capable of automatically aggregating duplicates based on both textual information and screenshots, and further summarizes the duplicate test reports into a comprehensive and comprehensible report. To validate CTRAS, we conducted quantitative studies using more than 5000 test reports, collected from 12 industrial crowdsourced projects. The experimental results reveal that CTRAS can reach an accuracy of 0.87, on average, regarding automatically detecting duplicate reports, and it outperforms the classic Max-Coverage-based and MMR summarization methods under Jensen Shannon divergence metric. Moreover, we conducted a task-based user study with 30 participants, whose result indicates that CTRAS can save nearly 30% time cost on average without loss of correctness.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"121 1","pages":"900-911"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":"{\"title\":\"CTRAS: Crowdsourced Test Report Aggregation and Summarization\",\"authors\":\"Rui Hao, Yang Feng, James A. Jones, Yuying Li, Zhenyu Chen\",\"doi\":\"10.1109/ICSE.2019.00096\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Crowdsourced testing has been widely adopted to improve the quality of various software products. Crowdsourced workers typically perform testing tasks and report their experiences through test reports. While the crowdsourced test reports provide feedbacks from real usage scenarios, inspecting such a large number of reports becomes a time-consuming yet inevitable task. To improve the efficiency of this task, existing widely used issue-tracking systems, such as JIRA, Bugzilla, and Mantis, have provided keyword-search-based methods to assist users in identifying duplicate test reports. However, on mobile devices (such as mobile phones), where the crowdsourced test reports often contain insufficient text descriptions but instead rich screenshots, these text-analysis-based methods become less effective because the data has fundamentally changed. In this paper, instead of focusing on only detecting duplicates based on textual descriptions, we present CTRAS: a novel approach to leveraging duplicates to enrich the content of bug descriptions and improve the efficiency of inspecting these reports. CTRAS is capable of automatically aggregating duplicates based on both textual information and screenshots, and further summarizes the duplicate test reports into a comprehensive and comprehensible report. To validate CTRAS, we conducted quantitative studies using more than 5000 test reports, collected from 12 industrial crowdsourced projects. The experimental results reveal that CTRAS can reach an accuracy of 0.87, on average, regarding automatically detecting duplicate reports, and it outperforms the classic Max-Coverage-based and MMR summarization methods under Jensen Shannon divergence metric. Moreover, we conducted a task-based user study with 30 participants, whose result indicates that CTRAS can save nearly 30% time cost on average without loss of correctness.\",\"PeriodicalId\":6736,\"journal\":{\"name\":\"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)\",\"volume\":\"121 1\",\"pages\":\"900-911\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"26\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE.2019.00096\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2019.00096","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 26

摘要

众包测试已被广泛采用，以提高各种软件产品的质量。众包工人通常执行测试任务，并通过测试报告报告他们的经验。虽然众包测试报告提供了真实使用场景的反馈，但检查如此大量的报告成为一项耗时但不可避免的任务。为了提高这项任务的效率，现有的广泛使用的问题跟踪系统，如JIRA、Bugzilla和Mantis，已经提供了基于关键字搜索的方法来帮助用户识别重复的测试报告。然而，在移动设备(如移动电话)上，众包测试报告通常包含不充分的文本描述，而不是丰富的屏幕截图，这些基于文本分析的方法变得不那么有效，因为数据已经从根本上改变了。在本文中，我们不是只关注基于文本描述的重复检测，而是提出了CTRAS:一种利用重复来丰富错误描述内容并提高检查这些报告效率的新方法。CTRAS能够根据文本信息和屏幕截图自动聚合重复的测试报告，并进一步将重复的测试报告总结为一个全面的、可理解的报告。为了验证CTRAS，我们使用从12个工业众包项目中收集的5000多份测试报告进行了定量研究。实验结果表明，CTRAS自动检测重复报告的平均准确率为0.87，优于经典的基于max - coverage和基于Jensen Shannon散度度量的MMR汇总方法。此外，我们对30名参与者进行了基于任务的用户研究，结果表明CTRAS可以在不损失正确性的情况下平均节省近30%的时间成本。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

CTRAS: Crowdsourced Test Report Aggregation and Summarization

Crowdsourced testing has been widely adopted to improve the quality of various software products. Crowdsourced workers typically perform testing tasks and report their experiences through test reports. While the crowdsourced test reports provide feedbacks from real usage scenarios, inspecting such a large number of reports becomes a time-consuming yet inevitable task. To improve the efficiency of this task, existing widely used issue-tracking systems, such as JIRA, Bugzilla, and Mantis, have provided keyword-search-based methods to assist users in identifying duplicate test reports. However, on mobile devices (such as mobile phones), where the crowdsourced test reports often contain insufficient text descriptions but instead rich screenshots, these text-analysis-based methods become less effective because the data has fundamentally changed. In this paper, instead of focusing on only detecting duplicates based on textual descriptions, we present CTRAS: a novel approach to leveraging duplicates to enrich the content of bug descriptions and improve the efficiency of inspecting these reports. CTRAS is capable of automatically aggregating duplicates based on both textual information and screenshots, and further summarizes the duplicate test reports into a comprehensive and comprehensible report. To validate CTRAS, we conducted quantitative studies using more than 5000 test reports, collected from 12 industrial crowdsourced projects. The experimental results reveal that CTRAS can reach an accuracy of 0.87, on average, regarding automatically detecting duplicate reports, and it outperforms the classic Max-Coverage-based and MMR summarization methods under Jensen Shannon divergence metric. Moreover, we conducted a task-based user study with 30 participants, whose result indicates that CTRAS can save nearly 30% time cost on average without loss of correctness.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)

自引率

0.00%

发文量

期刊最新文献

VFix: Value-Flow-Guided Precise Program Repair for Null Pointer Dereferences Search-Based Energy Testing of Android Scalable Approaches for Test Suite Reduction A System Identification Based Oracle for Control-CPS Software Fault Localization Training Binary Classifiers as Data Structure Invariants