Log-based slicing for system-level test cases
Salma Messaoudi, Donghwan Shin, Annibale Panichella, D. Bianculli, L. Briand
DOI: 10.1145/3460319.3464824
Regression testing is arguably one of the most important activities in software testing. However, its cost-effectiveness and usefulness can be largely impaired by complex system test cases that are poorly designed (e.g., test cases containing multiple test scenarios combined into a single test case) and that require a large amount of time and resources to run. One way to mitigate this issue is decomposing such system test cases into smaller, separate test cases---each of them with only one test scenario and its corresponding assertions---so that the execution time of the decomposed test cases is lower than that of the original test cases, while the test effectiveness of the original test cases is preserved. This decomposition can be achieved with program slicing techniques, since test cases are software programs too. However, existing static and dynamic slicing techniques exhibit limitations when (1) the test cases use external resources, (2) code instrumentation is not a viable option, and (3) test execution is expensive. In this paper, we propose a novel approach, called DS3 (Decomposing System teSt caSe), which automatically decomposes a complex system test case into separate test case slices. The idea is to use test case execution logs, obtained from past regression testing sessions, to identify "hidden" dependencies in the slices generated by static slicing. Since logs include run-time information about the system under test, we can use them to extract the access and usage of global resources and refine the slices generated by static slicing. We evaluated DS3 in terms of slicing effectiveness and compared it with a vanilla static slicing tool. We also compared the slices obtained by DS3 with the corresponding original system test cases, in terms of test efficiency and effectiveness. The evaluation results on one proprietary system and one open-source system show that DS3 accurately identifies the dependencies related to the usage of global resources, which vanilla static slicing misses. Moreover, the generated test case slices are, on average, 3.56 times faster than the original system test cases, with no significant loss in fault detection effectiveness.
{"title":"Log-based slicing for system-level test cases","authors":"Salma Messaoudi, Donghwan Shin, Annibale Panichella, D. Bianculli, L. Briand","doi":"10.1145/3460319.3464824","DOIUrl":"https://doi.org/10.1145/3460319.3464824","url":null,"abstract":"Regression testing is arguably one of the most important activities in software testing. However, its cost-effectiveness and usefulness can be largely impaired by complex system test cases that are poorly designed (e.g., test cases containing multiple test scenarios combined into a single test case) and that require a large amount of time and resources to run. One way to mitigate this issue is decomposing such system test cases into smaller, separate test cases---each of them with only one test scenario and with its corresponding assertions---so that the execution time of the decomposed test cases is lower than the original test cases, while the test effectiveness of the original test cases is preserved. This decomposition can be achieved with program slicing techniques, since test cases are software programs too. However, existing static and dynamic slicing techniques exhibit limitations when (1) the test cases use external resources, (2) code instrumentation is not a viable option, and (3) test execution is expensive. In this paper, we propose a novel approach, called DS3 (Decomposing System teSt caSe), which automatically decomposes a complex system test case into separate test case slices. The idea is to use test case execution logs, obtained from past regression testing sessions, to identify \"hidden\" dependencies in the slices generated by static slicing. Since logs include run-time information about the system under test, we can use them to extract access and usage of global resources and refine the slices generated by static slicing. We evaluated DS3 in terms of slicing effectiveness and compared it with a vanilla static slicing tool. We also compared the slices obtained by DS3 with the corresponding original system test cases, in terms of test efficiency and effectiveness. The evaluation results on one proprietary system and one open-source system show that DS3 is able to accurately identify the dependencies related to the usage of global resources, which vanilla static slicing misses. Moreover, the generated test case slices are, on average, 3.56 times faster than original system test cases and they exhibit no significant loss in terms of fault detection effectiveness.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134379591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Semantic table structure identification in spreadsheets
Yakun Zhang, Xiao Lv, Haoyu Dong, Wensheng Dou, Shi Han, Dongmei Zhang, Jun Wei, Dan Ye
DOI: 10.1145/3460319.3464812
Spreadsheets are widely used in various business tasks and contain large amounts of valuable data. However, spreadsheet tables are usually organized in a semi-structured way and contain complicated semantic structures, e.g., header types and relations among headers. Lacking documented semantic table structures, existing data analysis and error detection tools can hardly understand spreadsheet tables. Therefore, identifying semantic table structures in spreadsheet tables is of great importance and can greatly facilitate various analysis tasks on spreadsheets. In this paper, we propose Tasi (Table structure identification) to automatically identify semantic table structures in spreadsheets. Based on the contents, styles, and spatial locations of table headers, Tasi adopts a multi-classifier to predict potential header types and relations, and then integrates all header types and relations into consistent semantic table structures. We further propose TasiError to detect spreadsheet errors based on the semantic table structures identified by Tasi. Our experiments on real-world spreadsheets show that Tasi can precisely identify semantic table structures in spreadsheets, and TasiError can detect real-world spreadsheet errors with higher precision (75.2%) and recall (82.9%) than existing approaches.
{"title":"Semantic table structure identification in spreadsheets","authors":"Yakun Zhang, Xiao Lv, Haoyu Dong, Wensheng Dou, Shi Han, Dongmei Zhang, Jun Wei, Dan Ye","doi":"10.1145/3460319.3464812","DOIUrl":"https://doi.org/10.1145/3460319.3464812","url":null,"abstract":"Spreadsheets are widely used in various business tasks, and contain amounts of valuable data. However, spreadsheet tables are usually organized in a semi-structured way, and contain complicated semantic structures, e.g., header types and relations among headers. Lack of documented semantic table structures, existing data analysis and error detection tools can hardly understand spreadsheet tables. Therefore, identifying semantic table structures in spreadsheet tables is of great importance, and can greatly promote various analysis tasks on spreadsheets. In this paper, we propose Tasi (Table structure identification) to automatically identify semantic table structures in spreadsheets. Based on the contents, styles, and spatial locations in table headers, Tasi adopts a multi-classifier to predict potential header types and relations, and then integrates all header types and relations into consistent semantic table structures. We further propose TasiError, to detect spreadsheet errors based on the identified semantic table structures by Tasi. Our experiments on real-world spreadsheets show that, Tasi can precisely identify semantic table structures in spreadsheets, and TasiError can detect real-world spreadsheet errors with higher precision (75.2%) and recall (82.9%) than existing approaches.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"109 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113960401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

UAFSan: an object-identifier-based dynamic approach for detecting use-after-free vulnerabilities
Binfa Gui, Wei Song, Jeff Huang
DOI: 10.1145/3460319.3464835
Use-After-Free (UAF) vulnerabilities constitute severe threats to software security. In contrast to other memory errors, UAFs are more difficult to detect through manual or static analysis due to pointer aliases and the complicated relationships between pointers and objects. Existing evidence-based dynamic detection approaches track either pointers or objects to record the availability of objects, and this record becomes invalid when the memory that stored the freed object is reallocated. To this end, we propose UAFSan, an approach dedicated to comprehensively detecting UAFs at runtime. Specifically, we assign a unique identifier to each newly-allocated object and its pointers; when a pointer dereferences a memory object, we determine whether a UAF occurs by checking the consistency of their identifiers. We implement UAFSan as an open-source tool and evaluate it on a large collection of popular benchmarks and real-world programs. The experimental results demonstrate that UAFSan successfully detects all UAFs with reasonable overhead, whereas existing publicly-available dynamic detectors all miss certain UAFs.
{"title":"UAFSan: an object-identifier-based dynamic approach for detecting use-after-free vulnerabilities","authors":"Binfa Gui, Wei Song, Jeff Huang","doi":"10.1145/3460319.3464835","DOIUrl":"https://doi.org/10.1145/3460319.3464835","url":null,"abstract":"Use-After-Free (UAF) vulnerabilities constitute severe threats to software security. In contrast to other memory errors, UAFs are more difficult to detect through manual or static analysis due to pointer aliases and complicated relationships between pointers and objects. Existing evidence-based dynamic detection approaches only track either pointers or objects to record the availability of objects, which become invalid when the memory that stored the freed object is reallocated. To this end, we propose an approach UAFSan dedicated to comprehensively detecting UAFs at runtime. Specifically, we assign a unique identifier to each newly-allocated object and its pointers; when a pointer dereferences a memory object, we determine whether a UAF occurs by checking the consistency of their identifiers. We implement UAFSan in an open-source tool and evaluate it on a large collection of popular benchmarks and real-world programs. The experiment results demonstrate that UAFSan successfully detect all UAFs with reasonable overhead, whereas existing publicly-available dynamic detectors all miss certain UAFs.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116413086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

WebEvo: taming web application evolution via detecting semantic structure changes
Fei Shao, Ruiwen Xu, W. Haque, Jingwei Xu, Ying Zhang, Wei Yang, Yanfang Ye, Xusheng Xiao
DOI: 10.1145/3460319.3464800
The development of Web technology and the beginning of the Big Data era have led to technologies for extracting data from websites, such as information retrieval (IR) and robotic process automation (RPA) tools. As websites constantly evolve, it is important to monitor changes in websites and report them to developers and testers, to prevent these tools from malfunctioning as websites evolve. Existing monitoring tools mainly use DOM-tree-based techniques to detect changes in new web pages. However, these monitoring tools incorrectly report content-based changes (i.e., web content that is refreshed every time a page is retrieved) as changes that will adversely affect the IR and RPA tools. This results in false warnings, since the IR and RPA tools typically consider such changes expected and retrieve dynamic data from them. Moreover, these monitoring tools cannot identify GUI widget evolution (e.g., moving a button), and thus cannot help the IR and RPA tools adapt to the evolved widgets (e.g., through automatic repair of locators for the evolved widgets). To address these limitations, we propose WebEvo, an approach that leverages historic pages to identify the DOM elements whose changes are content-based and can therefore be safely ignored when reporting changes in new web pages. Furthermore, to identify refactoring changes that preserve the semantics and appearance of GUI widgets, WebEvo adapts computer vision (CV) techniques to map the GUI widgets from the old web page to the new web page on an element-by-element basis. Empirical evaluations on 13 real-world websites from 9 popular categories demonstrate the superiority of WebEvo over existing DOM-tree-based detection and whole-page visual comparison, in terms of both effectiveness and efficiency.
{"title":"WebEvo: taming web application evolution via detecting semantic structure changes","authors":"Fei Shao, Ruiwen Xu, W. Haque, Jingwei Xu, Ying Zhang, Wei Yang, Yanfang Ye, Xusheng Xiao","doi":"10.1145/3460319.3464800","DOIUrl":"https://doi.org/10.1145/3460319.3464800","url":null,"abstract":"The development of Web technology and the beginning of the Big Data era have led to the development of technologies for extracting data from websites, such as information retrieval (IR) and robotic process automation (RPA) tools. As websites are constantly evolving, to prevent these tools from functioning improperly due to website evolution, it is important to monitor the changes in websites and report them to the developers and testers. Existing monitoring tools mainly use DOM-tree based techniques to detect changes in the new web pages. However, these monitoring tools incorrectly report content-based changes (i.e., web content refreshed every time a web page is retrieved) as the changes that will adversely affect the performance of the IR and RPA tools. This results in false warnings since the IR and RPA tools typically consider these changes as expected and retrieve dynamic data from them. Moreover, these monitoring tools cannot identify GUI widget evolution (e.g., moving a button), and thus cannot help the IR and RPA tools adapt to the evolved widgets (e.g., automatic repair of locators for the evolved widgets). To address the limitations of the existing monitoring tools, we propose an approach, WebEvo, that leverages historic pages to identify the DOM elements whose changes are content-based changes, which can be safely ignored when reporting changes in the new web pages. Furthermore, to identify refactoring changes that preserve semantics and appearances of GUI widgets, WebEvo adapts computer vision (CV) techniques to identify the mappings of the GUI widgets from the old web page to the new web page on an element-by-element basis. Empirical evaluations on 13 real-world websites from 9 popular categories demonstrate the superiority of WebEvo over the existing DOM-tree based detection or whole-page visual comparison in terms of both effectiveness and efficiency.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115230561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Boosting symbolic execution via constraint solving time prediction (experience paper)
Sicheng Luo, Hui Xu, Yanxiang Bi, Xin Wang, Yangfan Zhou
DOI: 10.1145/3460319.3464813
Symbolic execution is an essential approach for automated test case generation. However, the approach is generally not scalable to large programs. One critical reason is that the constraint solving problems arising in symbolic execution are generally hard, so the symbolic execution process may get stuck solving them. To mitigate this issue, symbolic execution tools generally rely on a timeout threshold to terminate the solving, typically set to a fixed, predefined value, e.g., five minutes in angr. Nevertheless, setting a proper timeout is critical to the tool's efficiency. This paper proposes an approach that tackles the problem by predicting the time required to solve a constraint model, so that the symbolic execution engine can use this information to decide whether to continue the current solving process. Due to the cost of the prediction itself, our approach triggers the predictor only once the solving time has exceeded a relatively small threshold. We show that such a predictor achieves promising performance with several different machine learning models and datasets. By further employing an adaptive design, the predictor achieves an F1-score ranging from 0.743 to 0.800 on these datasets. We then apply the predictor to eight programs and conduct simulation experiments. Results show that the efficiency of constraint solving for symbolic execution can be improved by 1.25x to 3x, depending on the distribution of the hardness of the constraint models.
{"title":"Boosting symbolic execution via constraint solving time prediction (experience paper)","authors":"Sicheng Luo, Hui Xu, Yanxiang Bi, Xin Wang, Yangfan Zhou","doi":"10.1145/3460319.3464813","DOIUrl":"https://doi.org/10.1145/3460319.3464813","url":null,"abstract":"Symbolic execution is an essential approach for automated test case generation. However, the approach is generally not scalable to large programs. One critical reason is that the constraint solving problems in symbolic execution are generally hard. Consequently, the symbolic execution process may get stuck in solving such hard problems. To mitigate this issue, symbolic execution tools generally rely on a timeout threshold to terminate the solving. Such a timeout is generally set to a fixed, predefined value, e.g., five minutes in angr. Nevertheless, how to set a proper timeout is critical to the tool’s efficiency. This paper proposes an approach to tackle the problem by predicting the time required for solving a constraint model so that the symbolic execution engine could base on the information to determine whether to continue the current solving process. Due to the cost of the prediction itself, our approach triggers the predictor only when the solving time has exceeded a relatively small value. We have shown that such a predictor can achieve promising performance with several different machine learning models and datasets. By further employing an adaptive design, the predictor can achieve an F1-score ranging from 0.743 to 0.800 on these datasets. We then apply the predictor to eight programs and conduct simulation experiments. Results show that the efficiency of constraint solving for symbolic execution can be improved by 1.25x to 3x, depending on the distribution of the hardness of their constraint models.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128127598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Interval constraint-based mutation testing of numerical specifications
Clothilde Jeangoudoux, Eva Darulova, C. Lauter
DOI: 10.1145/3460319.3464808
Mutation testing is an established approach for checking whether code satisfies a code-independent functional specification, and for evaluating whether a test set is adequate. Current mutation testing approaches, however, do not account for the accuracy requirements that come with numerical specifications implemented in floating-point arithmetic code, a frequent part of safety-critical software. We present Magneto, an instantiation of mutation testing that fully automatically generates a test set from a real-valued specification. The generated tests check numerical code for accuracy, robustness, and functional behavior bugs. Our technique formulates test case and oracle generation as a constraint satisfaction problem over interval domains, which soundly bounds errors yet remains efficient. We evaluate Magneto on a standard floating-point benchmark set and find that it outperforms a random testing baseline at producing useful, adequate test sets.
{"title":"Interval constraint-based mutation testing of numerical specifications","authors":"Clothilde Jeangoudoux, Eva Darulova, C. Lauter","doi":"10.1145/3460319.3464808","DOIUrl":"https://doi.org/10.1145/3460319.3464808","url":null,"abstract":"Mutation testing is an established approach for checking whether code satisfies a code-independent functional specification, and for evaluating whether a test set is adequate. Current mutation testing approaches, however, do not account for accuracy requirements that appear with numerical specifications implemented in floating- point arithmetic code, but which are a frequent part of safety-critical software. We present Magneto, an instantiation of mutation testing that fully automatically generates a test set from a real-valued specification. The generated tests check numerical code for accuracy, robustness and functional behavior bugs. Our technique is based on formulating test case and oracle generation as a constraint satisfaction problem over interval domains, which soundly bounds errors, but is nonetheless efficient. We evaluate Magneto on a standard floating-point benchmark set and find that it outperforms a random testing baseline for producing useful adequate test sets.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134408382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Automated debugging: past, present, and future (ISSTA impact paper award)
Chris Parnin, A. Orso
DOI: 10.1145/3460319.3472397
The paper titled “Are Automated Debugging Techniques Actually Helping Programmers?” was published in the proceedings of the International Symposium on Software Testing and Analysis (ISSTA) in 2011, and has been selected to receive the ISSTA 2021 Impact Paper Award. The paper investigated, through two user studies, how developers used and benefited from popular automated debugging techniques. The results of the studies provided (1) evidence that several assumptions made by automated debugging techniques did not hold in practice and (2) insights on limitations of existing approaches and how these limitations could be addressed. In this talk, we revisit the original paper and the work that led to it. We then assess the impact of that research by reviewing how the area of automated debugging has evolved since the paper was published. Finally, we conclude the talk by reflecting on the current state of the art in this area and discussing open issues and potential directions for future work.
{"title":"Automated debugging: past, present, and future (ISSTA impact paper award)","authors":"Chris Parnin, A. Orso","doi":"10.1145/3460319.3472397","DOIUrl":"https://doi.org/10.1145/3460319.3472397","url":null,"abstract":"The paper titled “Are Automated Debugging Techniques Actually Helping Programmers?” was published in the proceedings of the International Symposium on Software Testing and Analysis (ISSTA) in 2011, and has been selected to receive the ISSTA 2021 Impact Paper Award. The paper investigated, through two user studies, how developers used and benefited from popular automated debugging techniques. The results of the studies provided (1) evidence that several assumptions made by automated debugging techniques did not hold in practice and (2) insights on limitations of existing approaches and how these limitations could be addressed. In this talk, we revisit the original paper and the work that led to it. We then assess the impact of that research by reviewing how the area of automated debugging has evolved since the paper was published. Finally, we conclude the talk by reflecting on the current state of the art in this area and discussing open issues and potential directions for future work.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133802850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Parema: an unpacking framework for demystifying VM-based Android packers
Lei Xue, Yuxiao Yan, Luyi Yan, Muhui Jiang, Xiapu Luo, Dinghao Wu, Yajin Zhou
DOI: 10.1145/3460319.3464839
Android packers have been widely adopted by developers to protect apps from being plagiarized. Meanwhile, various unpacking tools can unpack such apps through direct memory dumping. To defend against these off-the-shelf unpacking tools, packers have started to adopt virtual machine (VM) based protection techniques, which replace the original Dalvik bytecode (DCode) with customized bytecode (PCode) in memory. This defeats unpackers that rely on memory dumping. However, little is known about whether such packers can provide adequate protection to Android apps. In this paper, we take the first step towards demystifying the protections VM-based packers provide to apps. We propose novel program analysis techniques to investigate existing commercial VM-based packers, comprising a learning phase and a deobfuscation phase. We aim at deobfuscating the VM-protected DCode in three scenarios: recovering the original DCode or its semantics with training apps, and restoring the semantics without training apps. We also develop a prototype named Parema to automate much of the deobfuscation procedure. By applying it to online VM-based Android packers, we reveal that none of the evaluated packers provide adequate protection and all could be compromised.
{"title":"Parema: an unpacking framework for demystifying VM-based Android packers","authors":"Lei Xue, Yuxiao Yan, Luyi Yan, Muhui Jiang, Xiapu Luo, Dinghao Wu, Yajin Zhou","doi":"10.1145/3460319.3464839","DOIUrl":"https://doi.org/10.1145/3460319.3464839","url":null,"abstract":"Android packers have been widely adopted by developers to protect apps from being plagiarized. Meanwhile, various unpacking tools unpack the apps through direct memory dumping. To defend against these off-the-shelf unpacking tools, packers start to adopt virtual machine (VM) based protection techniques, which replace the original Dalvik bytecode (DCode) with customized bytecode (PCode) in memory. This defeats the unpackers using memory dumping mechanisms. However, little is known about whether such packers can provide enough protection to Android apps. In this paper, we aim to shed light on these questions and take the first step towards demystifying the protections provided to the apps by the VM-based packers. We proposed novel program analysis techniques to investigate existing commercial VM-based packers including a learning phase and a deobfuscation phase.We aim at deobfuscating the VM-protection DCode in three scenarios, recovering original DCode or its semantics with training apps, and restoring the semantics without training apps. We also develop a prototype named Parema to automate much work of the deobfuscation procedure. By applying it to the online VM-based Android packers, we reveal that all evaluated packers do not provide adequate protection and could be compromised.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114391091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Fuzzing SMT solvers via two-dimensional input space exploration
Peisen Yao, Heqing Huang, Wensheng Tang, Qingkai Shi, Rongxin Wu, Charles Zhang
DOI: 10.1145/3460319.3464803
Satisfiability Modulo Theories (SMT) solvers serve as the core engine of many techniques, such as symbolic execution. Therefore, ensuring the robustness and correctness of SMT solvers is critical. While fuzzing is an efficient and effective method for validating the quality of SMT solvers, we observe that prior fuzzing work focused only on generating various first-order formulas as inputs and neglected the algorithmic configuration space of an SMT solver, which leaves many deeply-hidden bugs unreported. In this paper, we present Falcon, a fuzzing technique that explores both the formula space and the configuration space. Combining the two spaces significantly enlarges the search space and makes it challenging to detect bugs efficiently. We solve this problem by utilizing the correlations between the two spaces to reduce the search space, and by introducing an adaptive mutation strategy to boost search efficiency. During six months of extensive testing, Falcon found 518 confirmed bugs in CVC4 and Z3, two state-of-the-art SMT solvers, 469 of which have already been fixed. Compared to two state-of-the-art fuzzers, Falcon detects 38 and 44 more bugs, respectively, and improves coverage by a large margin in 24 hours of testing.
{"title":"Fuzzing SMT solvers via two-dimensional input space exploration","authors":"Peisen Yao, Heqing Huang, Wensheng Tang, Qingkai Shi, Rongxin Wu, Charles Zhang","doi":"10.1145/3460319.3464803","DOIUrl":"https://doi.org/10.1145/3460319.3464803","url":null,"abstract":"Satisfiability Modulo Theories (SMT) solvers serve as the core engine of many techniques, such as symbolic execution. Therefore, ensuring the robustness and correctness of SMT solvers is critical. While fuzzing is an efficient and effective method for validating the quality of SMT solvers, we observe that prior fuzzing work only focused on generating various first-order formulas as the inputs but neglected the algorithmic configuration space of an SMT solver, which leads to under-reporting many deeply-hidden bugs. In this paper, we present Falcon, a fuzzing technique that explores both the formula space and the configuration space. Combining the two spaces significantly enlarges the search space and makes it challenging to detect bugs efficiently. We solve this problem by utilizing the correlations between the two spaces to reduce the search space, and introducing an adaptive mutation strategy to boost the search efficiency. During six months of extensive testing, Falcon finds 518 confirmed bugs in CVC4 and Z3, two state-of-the-art SMT solvers, 469 of which have already been fixed. Compared to two state-of-the-art fuzzers, Falcon detects 38 and 44 more bugs and improves the coverage by a large margin in 24 hours of testing.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128244979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

AdvDoor: adversarial backdoor attack of deep learning system
Quan Zhang, Yifeng Ding, Yongqiang Tian, Jianmin Guo, Min Yuan, Yu Jiang
DOI: 10.1145/3460319.3464809
Deep Learning (DL) systems have been widely used in many critical applications, such as autonomous vehicles and unmanned aerial vehicles. However, their security is threatened by backdoor attacks, which are carried out by adding artificial patterns to specific training data. Existing attack methods normally poison the data using a patch and can be easily caught by existing detection methods. In this work, we propose the Adversarial Backdoor, which utilizes the Targeted Universal Adversarial Perturbation (TUAP) to hide the anomalies in DL models and confuse existing powerful detection methods. Extensive experiments demonstrate that the Adversarial Backdoor can be injected stably with an attack success rate of around 98%. Moreover, the Adversarial Backdoor can bypass state-of-the-art backdoor detection methods: only around 37% of the poisoned models are caught, and less than 29% of the poisoned data is detected. In contrast, for the patch backdoor, all the poisoned models and more than 80% of the poisoned data are detected. This work intends to alert researchers and developers to this potential threat and to inspire the design of effective detection methods.
{"title":"AdvDoor: adversarial backdoor attack of deep learning system","authors":"Quan Zhang, Yifeng Ding, Yongqiang Tian, Jianmin Guo, Min Yuan, Yu Jiang","doi":"10.1145/3460319.3464809","DOIUrl":"https://doi.org/10.1145/3460319.3464809","url":null,"abstract":"Deep Learning (DL) system has been widely used in many critical applications, such as autonomous vehicles and unmanned aerial vehicles. However, their security is threatened by backdoor attack, which is achieved by adding artificial patterns on specific training data. Existing attack methods normally poison the data using a patch, and they can be easily detected by existing detection methods. In this work, we propose the Adversarial Backdoor, which utilizes the Targeted Universal Adversarial Perturbation (TUAP) to hide the anomalies in DL models and confuse existing powerful detection methods. With extensive experiments, it is demonstrated that Adversarial Backdoor can be injected stably with an attack success rate around 98%. Moreover, Adversarial Backdoor can bypass state-of-the-art backdoor detection methods. More specifically, only around 37% of the poisoned models can be caught, and less than 29% of the poisoned data cannot bypass the detection. In contrast, for the patch backdoor, all the poisoned models and more than 80% of the poisoned data will be detected. This work intends to alarm the researchers and developers of this potential threat and to inspire the designing of effective detection methods.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133853507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}