{"title":"通过图像和文本语义理解和特征集成增强众包测试报告优先级","authors":"Chunrong Fang;Shengcheng Yu;Quanjun Zhang;Xin Li;Yulei Liu;Zhenyu Chen","doi":"10.1109/TSE.2024.3516372","DOIUrl":null,"url":null,"abstract":"Crowdsourced testing has gained prominence in the field of software testing due to its ability to effectively address the challenges posed by the fragmentation problem in mobile app testing. The inherent openness of crowdsourced testing brings diversity to the testing outcome. However, it also presents challenges for app developers in inspecting a substantial quantity of test reports. To help app developers inspect the bugs in crowdsourced test reports as early as possible, crowdsourced test report prioritization has emerged as an effective technology by establishing a systematic optimal report inspecting sequence. Nevertheless, crowdsourced test reports consist of app screenshots and textual descriptions, but current prioritization approaches mostly rely on textual descriptions, and some may add vectorized image features at the image-as-a-whole level or widget level. They still lack precision in accurately characterizing the distinctive features of crowdsourced test reports. In terms of prioritization strategy, prevailing approaches adopt simple prioritization based on features combined merely using weighted coefficients, without adequately considering the semantics, which may result in biased and ineffective outcomes. In this paper, we propose \n<sc>EncrePrior</small>\n, an enhanced crowdsourced test report prioritization approach via image-and-text semantic understanding and feature integration. \n<sc>EncrePrior</small>\n extracts distinctive features from crowdsourced test reports. For app screenshots, \n<sc>EncrePrior</small>\n considers the structure (i.e., GUI layout) and the contents (i.e., GUI widgets), viewing the app screenshot from the macroscopic and microscopic perspectives, respectively. For textual descriptions, \n<sc>EncrePrior</small>\n considers the Bug Description and Reproduction Step as the bug context. During the prioritization, we do not directly merge the features with weights to guide the prioritization. Instead, in order to comprehensively consider the semantics, we adopt a prioritize-reprioritize strategy. This practice combines different features together by considering their individual ranks. The reports are first prioritized on four features separately. Then, the ranks on four sequences are used to lexicographically reprioritize the test reports with an integration of features from app screenshots and textual descriptions. 
Results of an empirical study show that \n<sc>EncrePrior</small>\n outperforms the representative baseline approach \n<sc>DeepPrior</small>\n by 15.61% on average, ranging from 2.99% to 63.64% on different apps, and the novelly proposed features and prioritization strategy all contribute to the excellent performance of \n<sc>EncrePrior</small>\n.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"51 1","pages":"283-304"},"PeriodicalIF":6.5000,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhanced Crowdsourced Test Report Prioritization via Image-and-Text Semantic Understanding and Feature Integration\",\"authors\":\"Chunrong Fang;Shengcheng Yu;Quanjun Zhang;Xin Li;Yulei Liu;Zhenyu Chen\",\"doi\":\"10.1109/TSE.2024.3516372\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Crowdsourced testing has gained prominence in the field of software testing due to its ability to effectively address the challenges posed by the fragmentation problem in mobile app testing. The inherent openness of crowdsourced testing brings diversity to the testing outcome. However, it also presents challenges for app developers in inspecting a substantial quantity of test reports. To help app developers inspect the bugs in crowdsourced test reports as early as possible, crowdsourced test report prioritization has emerged as an effective technology by establishing a systematic optimal report inspecting sequence. Nevertheless, crowdsourced test reports consist of app screenshots and textual descriptions, but current prioritization approaches mostly rely on textual descriptions, and some may add vectorized image features at the image-as-a-whole level or widget level. They still lack precision in accurately characterizing the distinctive features of crowdsourced test reports. In terms of prioritization strategy, prevailing approaches adopt simple prioritization based on features combined merely using weighted coefficients, without adequately considering the semantics, which may result in biased and ineffective outcomes. In this paper, we propose \\n<sc>EncrePrior</small>\\n, an enhanced crowdsourced test report prioritization approach via image-and-text semantic understanding and feature integration. \\n<sc>EncrePrior</small>\\n extracts distinctive features from crowdsourced test reports. For app screenshots, \\n<sc>EncrePrior</small>\\n considers the structure (i.e., GUI layout) and the contents (i.e., GUI widgets), viewing the app screenshot from the macroscopic and microscopic perspectives, respectively. For textual descriptions, \\n<sc>EncrePrior</small>\\n considers the Bug Description and Reproduction Step as the bug context. During the prioritization, we do not directly merge the features with weights to guide the prioritization. Instead, in order to comprehensively consider the semantics, we adopt a prioritize-reprioritize strategy. This practice combines different features together by considering their individual ranks. The reports are first prioritized on four features separately. Then, the ranks on four sequences are used to lexicographically reprioritize the test reports with an integration of features from app screenshots and textual descriptions. 
Results of an empirical study show that \\n<sc>EncrePrior</small>\\n outperforms the representative baseline approach \\n<sc>DeepPrior</small>\\n by 15.61% on average, ranging from 2.99% to 63.64% on different apps, and the novelly proposed features and prioritization strategy all contribute to the excellent performance of \\n<sc>EncrePrior</small>\\n.\",\"PeriodicalId\":13324,\"journal\":{\"name\":\"IEEE Transactions on Software Engineering\",\"volume\":\"51 1\",\"pages\":\"283-304\"},\"PeriodicalIF\":6.5000,\"publicationDate\":\"2024-12-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Software Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10795658/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10795658/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Enhanced Crowdsourced Test Report Prioritization via Image-and-Text Semantic Understanding and Feature Integration
Crowdsourced testing has gained prominence in the field of software testing due to its ability to effectively address the challenges posed by the fragmentation problem in mobile app testing. The inherent openness of crowdsourced testing brings diversity to the testing outcome, but it also challenges app developers, who must inspect a substantial quantity of test reports. To help app developers inspect the bugs in crowdsourced test reports as early as possible, crowdsourced test report prioritization has emerged as an effective technology that establishes a systematic, optimal report-inspection sequence. Crowdsourced test reports consist of app screenshots and textual descriptions, yet current prioritization approaches mostly rely on the textual descriptions, and some add vectorized image features only at the image-as-a-whole level or the widget level. They therefore lack precision in characterizing the distinctive features of crowdsourced test reports. In terms of prioritization strategy, prevailing approaches combine features merely through weighted coefficients, without adequately considering their semantics, which may yield biased and ineffective outcomes.
In this paper, we propose EncrePrior, an enhanced crowdsourced test report prioritization approach via image-and-text semantic understanding and feature integration. EncrePrior extracts distinctive features from crowdsourced test reports. For app screenshots, EncrePrior considers the structure (i.e., the GUI layout) and the contents (i.e., the GUI widgets), viewing each screenshot from the macroscopic and microscopic perspectives, respectively. For textual descriptions, EncrePrior treats the Bug Description and the Reproduction Step as the bug context. During prioritization, we do not directly merge the features with weights. Instead, to comprehensively consider the semantics, we adopt a prioritize-reprioritize strategy that combines the features by considering their individual ranks: the reports are first prioritized on each of the four features separately, and the ranks from the four resulting sequences are then used to lexicographically reprioritize the test reports, integrating the features from app screenshots and textual descriptions.
Results of an empirical study show that EncrePrior outperforms the representative baseline approach DeepPrior by 15.61% on average, ranging from 2.99% to 63.64% across different apps, and that the newly proposed features and prioritization strategy all contribute to the excellent performance of EncrePrior.
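To illustrate the prioritize-reprioritize strategy described in the abstract, the following Python sketch ranks reports on four features separately and then sorts them lexicographically by their tuples of per-feature ranks. The feature names, the made-up scores, and the ordering of features inside the rank tuple are illustrative assumptions for this sketch only, not EncrePrior's actual implementation.

from typing import Dict, List

# Hypothetical feature names; EncrePrior derives its four features from the
# GUI layout, the GUI widgets, the Bug Description, and the Reproduction Step.
FEATURES = ["gui_layout", "gui_widgets", "bug_description", "reproduction_step"]

def prioritize_reprioritize(reports: List[Dict[str, float]]) -> List[int]:
    """Return report indices in the reprioritized inspection order."""
    # Step 1: prioritize the reports on each feature separately.
    # ranks[(i, f)] is the position of report i in the sequence for feature f
    # (0 means the report is inspected earliest for that feature).
    ranks = {}
    for feature in FEATURES:
        order = sorted(range(len(reports)),
                       key=lambda i: reports[i][feature],
                       reverse=True)  # assume a higher score means a more bug-revealing report
        for position, report_idx in enumerate(order):
            ranks[(report_idx, feature)] = position

    # Step 2: reprioritize lexicographically on the tuples of per-feature
    # ranks (tuple comparison in Python is lexicographic).
    return sorted(range(len(reports)),
                  key=lambda i: tuple(ranks[(i, f)] for f in FEATURES))

if __name__ == "__main__":
    # Toy input: three reports with made-up per-feature scores.
    reports = [
        {"gui_layout": 0.7, "gui_widgets": 0.2, "bug_description": 0.9, "reproduction_step": 0.4},
        {"gui_layout": 0.7, "gui_widgets": 0.8, "bug_description": 0.1, "reproduction_step": 0.6},
        {"gui_layout": 0.3, "gui_widgets": 0.5, "bug_description": 0.5, "reproduction_step": 0.5},
    ]
    print(prioritize_reprioritize(reports))  # prints [0, 1, 2] for these scores

Note that lexicographic comparison makes the feature placed first in the tuple dominant; how the four ranks are actually ordered and integrated is part of the approach's design and is only assumed here.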
Journal introduction:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.