Log-based slicing for system-level test cases
Salma Messaoudi, Donghwan Shin, Annibale Panichella, D. Bianculli, L. Briand
DOI: 10.1145/3460319.3464824
Regression testing is arguably one of the most important activities in software testing. However, its cost-effectiveness and usefulness can be largely impaired by complex system test cases that are poorly designed (e.g., test cases containing multiple test scenarios combined into a single test case) and that require a large amount of time and resources to run. One way to mitigate this issue is decomposing such system test cases into smaller, separate test cases---each of them with only one test scenario and its corresponding assertions---so that the execution time of the decomposed test cases is lower than that of the original test cases, while the test effectiveness of the original test cases is preserved. This decomposition can be achieved with program slicing techniques, since test cases are software programs too. However, existing static and dynamic slicing techniques exhibit limitations when (1) the test cases use external resources, (2) code instrumentation is not a viable option, and (3) test execution is expensive. In this paper, we propose a novel approach, called DS3 (Decomposing System teSt caSe), which automatically decomposes a complex system test case into separate test case slices. The idea is to use test case execution logs, obtained from past regression testing sessions, to identify "hidden" dependencies in the slices generated by static slicing. Since logs include run-time information about the system under test, we can use them to extract the access and usage of global resources and refine the slices generated by static slicing. We evaluated DS3 in terms of slicing effectiveness and compared it with a vanilla static slicing tool. We also compared the slices obtained by DS3 with the corresponding original system test cases, in terms of test efficiency and effectiveness. The evaluation results on one proprietary system and one open-source system show that DS3 accurately identifies the dependencies related to the usage of global resources, which vanilla static slicing misses. Moreover, the generated test case slices are, on average, 3.56 times faster than the original system test cases, with no significant loss in fault detection effectiveness.
{"title":"Log-based slicing for system-level test cases","authors":"Salma Messaoudi, Donghwan Shin, Annibale Panichella, D. Bianculli, L. Briand","doi":"10.1145/3460319.3464824","DOIUrl":"https://doi.org/10.1145/3460319.3464824","url":null,"abstract":"Regression testing is arguably one of the most important activities in software testing. However, its cost-effectiveness and usefulness can be largely impaired by complex system test cases that are poorly designed (e.g., test cases containing multiple test scenarios combined into a single test case) and that require a large amount of time and resources to run. One way to mitigate this issue is decomposing such system test cases into smaller, separate test cases---each of them with only one test scenario and with its corresponding assertions---so that the execution time of the decomposed test cases is lower than the original test cases, while the test effectiveness of the original test cases is preserved. This decomposition can be achieved with program slicing techniques, since test cases are software programs too. However, existing static and dynamic slicing techniques exhibit limitations when (1) the test cases use external resources, (2) code instrumentation is not a viable option, and (3) test execution is expensive. In this paper, we propose a novel approach, called DS3 (Decomposing System teSt caSe), which automatically decomposes a complex system test case into separate test case slices. The idea is to use test case execution logs, obtained from past regression testing sessions, to identify \"hidden\" dependencies in the slices generated by static slicing. Since logs include run-time information about the system under test, we can use them to extract access and usage of global resources and refine the slices generated by static slicing. We evaluated DS3 in terms of slicing effectiveness and compared it with a vanilla static slicing tool. We also compared the slices obtained by DS3 with the corresponding original system test cases, in terms of test efficiency and effectiveness. The evaluation results on one proprietary system and one open-source system show that DS3 is able to accurately identify the dependencies related to the usage of global resources, which vanilla static slicing misses. Moreover, the generated test case slices are, on average, 3.56 times faster than original system test cases and they exhibit no significant loss in terms of fault detection effectiveness.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134379591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Semantic table structure identification in spreadsheets
Yakun Zhang, Xiao Lv, Haoyu Dong, Wensheng Dou, Shi Han, Dongmei Zhang, Jun Wei, Dan Ye
DOI: 10.1145/3460319.3464812
Spreadsheets are widely used in various business tasks and contain large amounts of valuable data. However, spreadsheet tables are usually organized in a semi-structured way and contain complicated semantic structures, e.g., header types and relations among headers. Lacking documented semantic table structures, existing data analysis and error detection tools can hardly understand spreadsheet tables. Therefore, identifying semantic table structures in spreadsheet tables is of great importance and can greatly facilitate various analysis tasks on spreadsheets. In this paper, we propose Tasi (Table structure identification) to automatically identify semantic table structures in spreadsheets. Based on the contents, styles, and spatial locations of table headers, Tasi adopts a multi-classifier to predict potential header types and relations, and then integrates all header types and relations into consistent semantic table structures. We further propose TasiError to detect spreadsheet errors based on the semantic table structures identified by Tasi. Our experiments on real-world spreadsheets show that Tasi can precisely identify semantic table structures in spreadsheets, and TasiError can detect real-world spreadsheet errors with higher precision (75.2%) and recall (82.9%) than existing approaches.
{"title":"Semantic table structure identification in spreadsheets","authors":"Yakun Zhang, Xiao Lv, Haoyu Dong, Wensheng Dou, Shi Han, Dongmei Zhang, Jun Wei, Dan Ye","doi":"10.1145/3460319.3464812","DOIUrl":"https://doi.org/10.1145/3460319.3464812","url":null,"abstract":"Spreadsheets are widely used in various business tasks, and contain amounts of valuable data. However, spreadsheet tables are usually organized in a semi-structured way, and contain complicated semantic structures, e.g., header types and relations among headers. Lack of documented semantic table structures, existing data analysis and error detection tools can hardly understand spreadsheet tables. Therefore, identifying semantic table structures in spreadsheet tables is of great importance, and can greatly promote various analysis tasks on spreadsheets. In this paper, we propose Tasi (Table structure identification) to automatically identify semantic table structures in spreadsheets. Based on the contents, styles, and spatial locations in table headers, Tasi adopts a multi-classifier to predict potential header types and relations, and then integrates all header types and relations into consistent semantic table structures. We further propose TasiError, to detect spreadsheet errors based on the identified semantic table structures by Tasi. Our experiments on real-world spreadsheets show that, Tasi can precisely identify semantic table structures in spreadsheets, and TasiError can detect real-world spreadsheet errors with higher precision (75.2%) and recall (82.9%) than existing approaches.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"109 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113960401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

UAFSan: an object-identifier-based dynamic approach for detecting use-after-free vulnerabilities
Binfa Gui, Wei Song, Jeff Huang
DOI: 10.1145/3460319.3464835
Use-After-Free (UAF) vulnerabilities constitute severe threats to software security. In contrast to other memory errors, UAFs are more difficult to detect through manual or static analysis due to pointer aliases and the complicated relationships between pointers and objects. Existing evidence-based dynamic detection approaches track either pointers or objects to record the availability of objects, and this record becomes invalid when the memory that stored the freed object is reallocated. To this end, we propose UAFSan, an approach dedicated to comprehensively detecting UAFs at runtime. Specifically, we assign a unique identifier to each newly-allocated object and its pointers; when a pointer dereferences a memory object, we determine whether a UAF occurs by checking the consistency of their identifiers. We implement UAFSan as an open-source tool and evaluate it on a large collection of popular benchmarks and real-world programs. The experimental results demonstrate that UAFSan successfully detects all UAFs with reasonable overhead, whereas existing publicly-available dynamic detectors all miss certain UAFs.
{"title":"UAFSan: an object-identifier-based dynamic approach for detecting use-after-free vulnerabilities","authors":"Binfa Gui, Wei Song, Jeff Huang","doi":"10.1145/3460319.3464835","DOIUrl":"https://doi.org/10.1145/3460319.3464835","url":null,"abstract":"Use-After-Free (UAF) vulnerabilities constitute severe threats to software security. In contrast to other memory errors, UAFs are more difficult to detect through manual or static analysis due to pointer aliases and complicated relationships between pointers and objects. Existing evidence-based dynamic detection approaches only track either pointers or objects to record the availability of objects, which become invalid when the memory that stored the freed object is reallocated. To this end, we propose an approach UAFSan dedicated to comprehensively detecting UAFs at runtime. Specifically, we assign a unique identifier to each newly-allocated object and its pointers; when a pointer dereferences a memory object, we determine whether a UAF occurs by checking the consistency of their identifiers. We implement UAFSan in an open-source tool and evaluate it on a large collection of popular benchmarks and real-world programs. The experiment results demonstrate that UAFSan successfully detect all UAFs with reasonable overhead, whereas existing publicly-available dynamic detectors all miss certain UAFs.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116413086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

WebEvo: taming web application evolution via detecting semantic structure changes
Fei Shao, Ruiwen Xu, W. Haque, Jingwei Xu, Ying Zhang, Wei Yang, Yanfang Ye, Xusheng Xiao
DOI: 10.1145/3460319.3464800
The development of Web technology and the beginning of the Big Data era have led to technologies for extracting data from websites, such as information retrieval (IR) and robotic process automation (RPA) tools. As websites constantly evolve, it is important to monitor changes in websites and report them to developers and testers, to prevent these tools from malfunctioning as websites evolve. Existing monitoring tools mainly use DOM-tree-based techniques to detect changes in new web pages. However, these monitoring tools incorrectly report content-based changes (i.e., web content that is refreshed every time a page is retrieved) as changes that will adversely affect the IR and RPA tools. This results in false warnings, since the IR and RPA tools typically consider such changes expected and retrieve dynamic data from them. Moreover, these monitoring tools cannot identify GUI widget evolution (e.g., moving a button), and thus cannot help the IR and RPA tools adapt to the evolved widgets (e.g., through automatic repair of locators for the evolved widgets). To address these limitations, we propose WebEvo, an approach that leverages historic pages to identify the DOM elements whose changes are content-based and can therefore be safely ignored when reporting changes in new web pages. Furthermore, to identify refactoring changes that preserve the semantics and appearance of GUI widgets, WebEvo adapts computer vision (CV) techniques to map the GUI widgets from the old web page to the new web page on an element-by-element basis. Empirical evaluations on 13 real-world websites from 9 popular categories demonstrate the superiority of WebEvo over existing DOM-tree-based detection and whole-page visual comparison, in terms of both effectiveness and efficiency.
{"title":"WebEvo: taming web application evolution via detecting semantic structure changes","authors":"Fei Shao, Ruiwen Xu, W. Haque, Jingwei Xu, Ying Zhang, Wei Yang, Yanfang Ye, Xusheng Xiao","doi":"10.1145/3460319.3464800","DOIUrl":"https://doi.org/10.1145/3460319.3464800","url":null,"abstract":"The development of Web technology and the beginning of the Big Data era have led to the development of technologies for extracting data from websites, such as information retrieval (IR) and robotic process automation (RPA) tools. As websites are constantly evolving, to prevent these tools from functioning improperly due to website evolution, it is important to monitor the changes in websites and report them to the developers and testers. Existing monitoring tools mainly use DOM-tree based techniques to detect changes in the new web pages. However, these monitoring tools incorrectly report content-based changes (i.e., web content refreshed every time a web page is retrieved) as the changes that will adversely affect the performance of the IR and RPA tools. This results in false warnings since the IR and RPA tools typically consider these changes as expected and retrieve dynamic data from them. Moreover, these monitoring tools cannot identify GUI widget evolution (e.g., moving a button), and thus cannot help the IR and RPA tools adapt to the evolved widgets (e.g., automatic repair of locators for the evolved widgets). To address the limitations of the existing monitoring tools, we propose an approach, WebEvo, that leverages historic pages to identify the DOM elements whose changes are content-based changes, which can be safely ignored when reporting changes in the new web pages. Furthermore, to identify refactoring changes that preserve semantics and appearances of GUI widgets, WebEvo adapts computer vision (CV) techniques to identify the mappings of the GUI widgets from the old web page to the new web page on an element-by-element basis. Empirical evaluations on 13 real-world websites from 9 popular categories demonstrate the superiority of WebEvo over the existing DOM-tree based detection or whole-page visual comparison in terms of both effectiveness and efficiency.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115230561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Boosting symbolic execution via constraint solving time prediction (experience paper)
Sicheng Luo, Hui Xu, Yanxiang Bi, Xin Wang, Yangfan Zhou
DOI: 10.1145/3460319.3464813
Symbolic execution is an essential approach for automated test case generation. However, the approach is generally not scalable to large programs. One critical reason is that the constraint solving problems arising in symbolic execution are generally hard, so the symbolic execution process may get stuck solving them. To mitigate this issue, symbolic execution tools generally rely on a timeout threshold to terminate the solving, typically set to a fixed, predefined value, e.g., five minutes in angr. Nevertheless, setting a proper timeout is critical to the tool's efficiency. This paper proposes an approach that tackles the problem by predicting the time required to solve a constraint model, so that the symbolic execution engine can use this information to decide whether to continue the current solving process. Due to the cost of the prediction itself, our approach triggers the predictor only once the solving time has exceeded a relatively small threshold. We show that such a predictor achieves promising performance with several different machine learning models and datasets. By further employing an adaptive design, the predictor achieves an F1-score ranging from 0.743 to 0.800 on these datasets. We then apply the predictor to eight programs and conduct simulation experiments. Results show that the efficiency of constraint solving for symbolic execution can be improved by 1.25x to 3x, depending on the distribution of the hardness of the constraint models.
{"title":"Boosting symbolic execution via constraint solving time prediction (experience paper)","authors":"Sicheng Luo, Hui Xu, Yanxiang Bi, Xin Wang, Yangfan Zhou","doi":"10.1145/3460319.3464813","DOIUrl":"https://doi.org/10.1145/3460319.3464813","url":null,"abstract":"Symbolic execution is an essential approach for automated test case generation. However, the approach is generally not scalable to large programs. One critical reason is that the constraint solving problems in symbolic execution are generally hard. Consequently, the symbolic execution process may get stuck in solving such hard problems. To mitigate this issue, symbolic execution tools generally rely on a timeout threshold to terminate the solving. Such a timeout is generally set to a fixed, predefined value, e.g., five minutes in angr. Nevertheless, how to set a proper timeout is critical to the tool’s efficiency. This paper proposes an approach to tackle the problem by predicting the time required for solving a constraint model so that the symbolic execution engine could base on the information to determine whether to continue the current solving process. Due to the cost of the prediction itself, our approach triggers the predictor only when the solving time has exceeded a relatively small value. We have shown that such a predictor can achieve promising performance with several different machine learning models and datasets. By further employing an adaptive design, the predictor can achieve an F1-score ranging from 0.743 to 0.800 on these datasets. We then apply the predictor to eight programs and conduct simulation experiments. Results show that the efficiency of constraint solving for symbolic execution can be improved by 1.25x to 3x, depending on the distribution of the hardness of their constraint models.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128127598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Interval constraint-based mutation testing of numerical specifications
Clothilde Jeangoudoux, Eva Darulova, C. Lauter
DOI: 10.1145/3460319.3464808
Mutation testing is an established approach for checking whether code satisfies a code-independent functional specification, and for evaluating whether a test set is adequate. Current mutation testing approaches, however, do not account for the accuracy requirements that come with numerical specifications implemented in floating-point arithmetic code, a frequent part of safety-critical software. We present Magneto, an instantiation of mutation testing that fully automatically generates a test set from a real-valued specification. The generated tests check numerical code for accuracy, robustness, and functional behavior bugs. Our technique formulates test case and oracle generation as a constraint satisfaction problem over interval domains, which soundly bounds errors yet remains efficient. We evaluate Magneto on a standard floating-point benchmark set and find that it outperforms a random testing baseline at producing useful, adequate test sets.
{"title":"Interval constraint-based mutation testing of numerical specifications","authors":"Clothilde Jeangoudoux, Eva Darulova, C. Lauter","doi":"10.1145/3460319.3464808","DOIUrl":"https://doi.org/10.1145/3460319.3464808","url":null,"abstract":"Mutation testing is an established approach for checking whether code satisfies a code-independent functional specification, and for evaluating whether a test set is adequate. Current mutation testing approaches, however, do not account for accuracy requirements that appear with numerical specifications implemented in floating- point arithmetic code, but which are a frequent part of safety-critical software. We present Magneto, an instantiation of mutation testing that fully automatically generates a test set from a real-valued specification. The generated tests check numerical code for accuracy, robustness and functional behavior bugs. Our technique is based on formulating test case and oracle generation as a constraint satisfaction problem over interval domains, which soundly bounds errors, but is nonetheless efficient. We evaluate Magneto on a standard floating-point benchmark set and find that it outperforms a random testing baseline for producing useful adequate test sets.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134408382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Automated debugging: past, present, and future (ISSTA impact paper award)
Chris Parnin, A. Orso
DOI: 10.1145/3460319.3472397
The paper titled “Are Automated Debugging Techniques Actually Helping Programmers?” was published in the proceedings of the International Symposium on Software Testing and Analysis (ISSTA) in 2011, and has been selected to receive the ISSTA 2021 Impact Paper Award. The paper investigated, through two user studies, how developers used and benefited from popular automated debugging techniques. The results of the studies provided (1) evidence that several assumptions made by automated debugging techniques did not hold in practice and (2) insights on limitations of existing approaches and how these limitations could be addressed. In this talk, we revisit the original paper and the work that led to it. We then assess the impact of that research by reviewing how the area of automated debugging has evolved since the paper was published. Finally, we conclude the talk by reflecting on the current state of the art in this area and discussing open issues and potential directions for future work.
{"title":"Automated debugging: past, present, and future (ISSTA impact paper award)","authors":"Chris Parnin, A. Orso","doi":"10.1145/3460319.3472397","DOIUrl":"https://doi.org/10.1145/3460319.3472397","url":null,"abstract":"The paper titled “Are Automated Debugging Techniques Actually Helping Programmers?” was published in the proceedings of the International Symposium on Software Testing and Analysis (ISSTA) in 2011, and has been selected to receive the ISSTA 2021 Impact Paper Award. The paper investigated, through two user studies, how developers used and benefited from popular automated debugging techniques. The results of the studies provided (1) evidence that several assumptions made by automated debugging techniques did not hold in practice and (2) insights on limitations of existing approaches and how these limitations could be addressed. In this talk, we revisit the original paper and the work that led to it. We then assess the impact of that research by reviewing how the area of automated debugging has evolved since the paper was published. Finally, we conclude the talk by reflecting on the current state of the art in this area and discussing open issues and potential directions for future work.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133802850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Parema: an unpacking framework for demystifying VM-based Android packers
Lei Xue, Yuxiao Yan, Luyi Yan, Muhui Jiang, Xiapu Luo, Dinghao Wu, Yajin Zhou
DOI: 10.1145/3460319.3464839
Android packers have been widely adopted by developers to protect apps from being plagiarized. Meanwhile, various unpacking tools can unpack such apps through direct memory dumping. To defend against these off-the-shelf unpacking tools, packers have started to adopt virtual machine (VM) based protection techniques, which replace the original Dalvik bytecode (DCode) with customized bytecode (PCode) in memory. This defeats unpackers that rely on memory dumping. However, little is known about whether such packers can provide adequate protection to Android apps. In this paper, we take the first step towards demystifying the protections VM-based packers provide to apps. We propose novel program analysis techniques to investigate existing commercial VM-based packers, comprising a learning phase and a deobfuscation phase. We aim at deobfuscating the VM-protected DCode in three scenarios: recovering the original DCode or its semantics with training apps, and restoring the semantics without training apps. We also develop a prototype named Parema to automate much of the deobfuscation procedure. By applying it to online VM-based Android packers, we reveal that none of the evaluated packers provide adequate protection and all could be compromised.
{"title":"Parema: an unpacking framework for demystifying VM-based Android packers","authors":"Lei Xue, Yuxiao Yan, Luyi Yan, Muhui Jiang, Xiapu Luo, Dinghao Wu, Yajin Zhou","doi":"10.1145/3460319.3464839","DOIUrl":"https://doi.org/10.1145/3460319.3464839","url":null,"abstract":"Android packers have been widely adopted by developers to protect apps from being plagiarized. Meanwhile, various unpacking tools unpack the apps through direct memory dumping. To defend against these off-the-shelf unpacking tools, packers start to adopt virtual machine (VM) based protection techniques, which replace the original Dalvik bytecode (DCode) with customized bytecode (PCode) in memory. This defeats the unpackers using memory dumping mechanisms. However, little is known about whether such packers can provide enough protection to Android apps. In this paper, we aim to shed light on these questions and take the first step towards demystifying the protections provided to the apps by the VM-based packers. We proposed novel program analysis techniques to investigate existing commercial VM-based packers including a learning phase and a deobfuscation phase.We aim at deobfuscating the VM-protection DCode in three scenarios, recovering original DCode or its semantics with training apps, and restoring the semantics without training apps. We also develop a prototype named Parema to automate much work of the deobfuscation procedure. By applying it to the online VM-based Android packers, we reveal that all evaluated packers do not provide adequate protection and could be compromised.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114391091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Fuzzing SMT solvers via two-dimensional input space exploration
Peisen Yao, Heqing Huang, Wensheng Tang, Qingkai Shi, Rongxin Wu, Charles Zhang
DOI: 10.1145/3460319.3464803
Satisfiability Modulo Theories (SMT) solvers serve as the core engine of many techniques, such as symbolic execution. Therefore, ensuring the robustness and correctness of SMT solvers is critical. While fuzzing is an efficient and effective method for validating the quality of SMT solvers, we observe that prior fuzzing work focused only on generating various first-order formulas as inputs and neglected the algorithmic configuration space of an SMT solver, which leaves many deeply-hidden bugs unreported. In this paper, we present Falcon, a fuzzing technique that explores both the formula space and the configuration space. Combining the two spaces significantly enlarges the search space and makes it challenging to detect bugs efficiently. We solve this problem by utilizing the correlations between the two spaces to reduce the search space, and by introducing an adaptive mutation strategy to boost search efficiency. During six months of extensive testing, Falcon found 518 confirmed bugs in CVC4 and Z3, two state-of-the-art SMT solvers, 469 of which have already been fixed. Compared to two state-of-the-art fuzzers, Falcon detects 38 and 44 more bugs, respectively, and improves coverage by a large margin in 24 hours of testing.
{"title":"Fuzzing SMT solvers via two-dimensional input space exploration","authors":"Peisen Yao, Heqing Huang, Wensheng Tang, Qingkai Shi, Rongxin Wu, Charles Zhang","doi":"10.1145/3460319.3464803","DOIUrl":"https://doi.org/10.1145/3460319.3464803","url":null,"abstract":"Satisfiability Modulo Theories (SMT) solvers serve as the core engine of many techniques, such as symbolic execution. Therefore, ensuring the robustness and correctness of SMT solvers is critical. While fuzzing is an efficient and effective method for validating the quality of SMT solvers, we observe that prior fuzzing work only focused on generating various first-order formulas as the inputs but neglected the algorithmic configuration space of an SMT solver, which leads to under-reporting many deeply-hidden bugs. In this paper, we present Falcon, a fuzzing technique that explores both the formula space and the configuration space. Combining the two spaces significantly enlarges the search space and makes it challenging to detect bugs efficiently. We solve this problem by utilizing the correlations between the two spaces to reduce the search space, and introducing an adaptive mutation strategy to boost the search efficiency. During six months of extensive testing, Falcon finds 518 confirmed bugs in CVC4 and Z3, two state-of-the-art SMT solvers, 469 of which have already been fixed. Compared to two state-of-the-art fuzzers, Falcon detects 38 and 44 more bugs and improves the coverage by a large margin in 24 hours of testing.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128244979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

AdvDoor: adversarial backdoor attack of deep learning system
Quan Zhang, Yifeng Ding, Yongqiang Tian, Jianmin Guo, Min Yuan, Yu Jiang
DOI: 10.1145/3460319.3464809
Deep Learning (DL) systems have been widely used in many critical applications, such as autonomous vehicles and unmanned aerial vehicles. However, their security is threatened by backdoor attacks, which are carried out by adding artificial patterns to specific training data. Existing attack methods normally poison the data using a patch and can be easily caught by existing detection methods. In this work, we propose the Adversarial Backdoor, which utilizes the Targeted Universal Adversarial Perturbation (TUAP) to hide the anomalies in DL models and confuse existing powerful detection methods. Extensive experiments demonstrate that the Adversarial Backdoor can be injected stably with an attack success rate of around 98%. Moreover, the Adversarial Backdoor can bypass state-of-the-art backdoor detection methods: only around 37% of the poisoned models are caught, and less than 29% of the poisoned data is detected. In contrast, for the patch backdoor, all the poisoned models and more than 80% of the poisoned data are detected. This work intends to alert researchers and developers to this potential threat and to inspire the design of effective detection methods.
{"title":"AdvDoor: adversarial backdoor attack of deep learning system","authors":"Quan Zhang, Yifeng Ding, Yongqiang Tian, Jianmin Guo, Min Yuan, Yu Jiang","doi":"10.1145/3460319.3464809","DOIUrl":"https://doi.org/10.1145/3460319.3464809","url":null,"abstract":"Deep Learning (DL) system has been widely used in many critical applications, such as autonomous vehicles and unmanned aerial vehicles. However, their security is threatened by backdoor attack, which is achieved by adding artificial patterns on specific training data. Existing attack methods normally poison the data using a patch, and they can be easily detected by existing detection methods. In this work, we propose the Adversarial Backdoor, which utilizes the Targeted Universal Adversarial Perturbation (TUAP) to hide the anomalies in DL models and confuse existing powerful detection methods. With extensive experiments, it is demonstrated that Adversarial Backdoor can be injected stably with an attack success rate around 98%. Moreover, Adversarial Backdoor can bypass state-of-the-art backdoor detection methods. More specifically, only around 37% of the poisoned models can be caught, and less than 29% of the poisoned data cannot bypass the detection. In contrast, for the patch backdoor, all the poisoned models and more than 80% of the poisoned data will be detected. This work intends to alarm the researchers and developers of this potential threat and to inspire the designing of effective detection methods.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133853507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}