Test Input Prioritization for 3D Point Clouds

IF 6.6; Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING; ACM Transactions on Software Engineering and Methodology; Pub Date: 2024-01-27; DOI: 10.1145/3643676

Yinghua Li, Xueqi Dang, Lei Ma, Jacques Klein, Yves LE Traon, Tegawendé F. Bissyandé

{"title":"三维点云测试输入优先级排序","authors":"Yinghua Li, Xueqi Dang, Lei Ma, Jacques Klein, Yves LE Traon, Tegawendé F. Bissyandé","doi":"10.1145/3643676","DOIUrl":null,"url":null,"abstract":"<p>Three-dimensional (3D) point cloud applications have become increasingly prevalent in diverse domains, showcasing their efficacy in various software systems. However, testing such applications presents unique challenges due to the high-dimensional nature of 3D point cloud data and the vast number of possible test cases. Test input prioritization has emerged as a promising approach to enhance testing efficiency by prioritizing potentially misclassified test cases during the early stages of the testing process. Consequently, this enables the early labeling of critical inputs, leading to a reduction in the overall labeling cost. However, applying existing prioritization methods to 3D point cloud data is constrained by several factors: 1) Inadequate consideration of crucial spatial information, and 2) susceptibility to noises inherent in 3D point cloud data. In this paper, we propose PCPrior, the first test prioritization approach specifically designed for 3D point cloud test cases. The fundamental concept behind PCPrior is that test inputs closer to the decision boundary of the model are more likely to be predicted incorrectly. To capture the spatial relationship between a point cloud test and the decision boundary, we propose transforming each test (a point cloud) into a low-dimensional feature vector, towards indirectly revealing the underlying proximity between a test and the decision boundary. To achieve this, we carefully design a group of feature generation strategies, and for each test input, we generate four distinct types of features, namely, spatial features, mutation features, prediction features, and uncertainty features. Through a concatenation of the four feature types, PCPrior assembles a final feature vector for each test. 
Subsequently, a ranking model is employed to estimate the probability of misclassification for each test based on its feature vector. Finally, PCPrior ranks all tests based on their misclassification probabilities. We conducted an extensive study based on 165 subjects to evaluate the performance of PCPrior, encompassing both natural and noisy datasets. The results demonstrate that PCPrior outperforms all the compared test prioritization approaches, with an average improvement of 10.99%~66.94% on natural datasets and 16.62%~53% on noisy datasets.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"5 1","pages":""},"PeriodicalIF":6.6000,"publicationDate":"2024-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Test Input Prioritization for 3D Point Clouds\",\"authors\":\"Yinghua Li, Xueqi Dang, Lei Ma, Jacques Klein, Yves LE Traon, Tegawendé F. Bissyandé\",\"doi\":\"10.1145/3643676\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Three-dimensional (3D) point cloud applications have become increasingly prevalent in diverse domains, showcasing their efficacy in various software systems. However, testing such applications presents unique challenges due to the high-dimensional nature of 3D point cloud data and the vast number of possible test cases. Test input prioritization has emerged as a promising approach to enhance testing efficiency by prioritizing potentially misclassified test cases during the early stages of the testing process. Consequently, this enables the early labeling of critical inputs, leading to a reduction in the overall labeling cost. However, applying existing prioritization methods to 3D point cloud data is constrained by several factors: 1) Inadequate consideration of crucial spatial information, and 2) susceptibility to noises inherent in 3D point cloud data. 
In this paper, we propose PCPrior, the first test prioritization approach specifically designed for 3D point cloud test cases. The fundamental concept behind PCPrior is that test inputs closer to the decision boundary of the model are more likely to be predicted incorrectly. To capture the spatial relationship between a point cloud test and the decision boundary, we propose transforming each test (a point cloud) into a low-dimensional feature vector, towards indirectly revealing the underlying proximity between a test and the decision boundary. To achieve this, we carefully design a group of feature generation strategies, and for each test input, we generate four distinct types of features, namely, spatial features, mutation features, prediction features, and uncertainty features. Through a concatenation of the four feature types, PCPrior assembles a final feature vector for each test. Subsequently, a ranking model is employed to estimate the probability of misclassification for each test based on its feature vector. Finally, PCPrior ranks all tests based on their misclassification probabilities. We conducted an extensive study based on 165 subjects to evaluate the performance of PCPrior, encompassing both natural and noisy datasets. 
The results demonstrate that PCPrior outperforms all the compared test prioritization approaches, with an average improvement of 10.99%~66.94% on natural datasets and 16.62%~53% on noisy datasets.</p>\",\"PeriodicalId\":50933,\"journal\":{\"name\":\"ACM Transactions on Software Engineering and Methodology\",\"volume\":\"5 1\",\"pages\":\"\"},\"PeriodicalIF\":6.6000,\"publicationDate\":\"2024-01-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Software Engineering and Methodology\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3643676\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3643676","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

Citations: 0

Abstract

Test Input Prioritization for 3D Point Clouds

Three-dimensional (3D) point cloud applications have become increasingly prevalent in diverse domains, showcasing their efficacy in various software systems. However, testing such applications presents unique challenges due to the high-dimensional nature of 3D point cloud data and the vast number of possible test cases. Test input prioritization has emerged as a promising approach to enhance testing efficiency by prioritizing potentially misclassified test cases during the early stages of the testing process. This enables the early labeling of critical inputs and thus reduces the overall labeling cost. However, applying existing prioritization methods to 3D point cloud data is constrained by two factors: 1) inadequate consideration of crucial spatial information, and 2) susceptibility to the noise inherent in 3D point cloud data. In this paper, we propose PCPrior, the first test prioritization approach specifically designed for 3D point cloud test cases. The fundamental concept behind PCPrior is that test inputs closer to the decision boundary of the model are more likely to be predicted incorrectly. To capture the spatial relationship between a point cloud test and the decision boundary, we propose transforming each test (a point cloud) into a low-dimensional feature vector, which indirectly reveals the proximity between the test and the decision boundary. To achieve this, we design a group of feature generation strategies, and for each test input we generate four distinct types of features: spatial features, mutation features, prediction features, and uncertainty features. By concatenating the four feature types, PCPrior assembles a final feature vector for each test. Subsequently, a ranking model is employed to estimate the probability of misclassification for each test based on its feature vector. Finally, PCPrior ranks all tests based on their misclassification probabilities.

We conducted an extensive study based on 165 subjects to evaluate the performance of PCPrior, encompassing both natural and noisy datasets. The results demonstrate that PCPrior outperforms all the compared test prioritization approaches, with an average improvement of 10.99% to 66.94% on natural datasets and 16.62% to 53% on noisy datasets.
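The pipeline described in the abstract (four feature types, concatenation, a ranking model, then sorting by estimated misclassification probability) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not define the individual features or name the ranking model, so the specific statistics, the Gaussian-jitter mutation, the logistic-regression stand-in, and every function name here (including `toy_predict_proba`, a hypothetical model under test) are assumptions.

```python
import numpy as np

def spatial_features(cloud):
    """Geometric statistics of an (N, 3) cloud: per-axis variance and max radius."""
    centered = cloud - cloud.mean(axis=0)
    radius = np.linalg.norm(centered, axis=1).max()
    return np.append(centered.var(axis=0), radius)

def uncertainty_features(probs):
    """Three scores that grow as the prediction nears the decision boundary."""
    second, first = np.sort(probs)[-2:]
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    gini = 1.0 - np.sum(probs ** 2)
    return np.array([1.0 - (first - second), entropy, gini])

def mutation_features(cloud, predict_proba, rng, n_mutants=5, sigma=0.01):
    """One flag per jittered copy: 1.0 if its predicted label differs from the original's."""
    label = np.argmax(predict_proba(cloud))
    flags = []
    for _ in range(n_mutants):
        mutant = cloud + rng.normal(0.0, sigma, size=cloud.shape)
        flags.append(float(np.argmax(predict_proba(mutant)) != label))
    return np.array(flags)

def build_feature_vector(cloud, predict_proba, rng):
    """Concatenate spatial, mutation, prediction, and uncertainty features."""
    probs = predict_proba(cloud)  # the prediction features are the class probabilities
    return np.concatenate([
        spatial_features(cloud),
        mutation_features(cloud, predict_proba, rng),
        probs,
        uncertainty_features(probs),
    ])

def fit_logistic(X, y, lr=0.1, steps=500):
    """Stand-in ranking model (assumption): logistic regression via gradient descent.
    y holds 1.0 for tests the model misclassified and 0.0 otherwise."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def misclassification_probability(X, w, b):
    """Estimated probability that each test (row of X) is misclassified."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def prioritize(X, w, b):
    """Indices of the tests, most likely misclassified first."""
    return np.argsort(-misclassification_probability(X, w, b), kind="stable")

def toy_predict_proba(cloud):
    """Hypothetical two-class model: the class depends on the mean z-coordinate."""
    logit = 5.0 * cloud[:, 2].mean()
    e = np.exp([logit, -logit])
    return e / e.sum()
```

In use, one would build a feature matrix by calling `build_feature_vector` on each cloud of a labeled set, fit the ranking model on whether each of those tests was misclassified, and then call `prioritize` on the feature matrix of the unlabeled tests so that the top-ranked inputs are labeled first.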

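Source journal

ACM Transactions on Software Engineering and Methodology (Engineering & Technology; Computer Science: Software Engineering)

CiteScore: 6.30

Self-citation rate: 4.50%

Articles published: 164

Review time: >12 weeks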

About the journal: Designing and building a large, complex software system is a tremendous challenge. ACM Transactions on Software Engineering and Methodology (TOSEM) publishes papers on all aspects of that challenge: specification, design, development and maintenance. It covers tools and methodologies, languages, data structures, and algorithms. TOSEM also reports on successful efforts, noting practical lessons that can be scaled and transferred to other projects, and often looks at applications of innovative technologies. The tone is scholarly but readable; the content is worthy of study; the presentation is effective.