Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis: Latest Articles

Lightweight verification of array indexing
Martin Kellogg, Vlastimil Dort, Suzanne Millstein, Michael D. Ernst
In languages like C, out-of-bounds array accesses lead to security vulnerabilities and crashes. Even in managed languages like Java, which check array bounds at run time, out-of-bounds accesses cause exceptions that terminate the program. We present a lightweight type system that certifies, at compile time, that array accesses in the program are in-bounds. The type system consists of several cooperating hierarchies of dependent types, specialized to the domain of array bounds-checking. Programmers write type annotations at procedure boundaries, allowing modular verification at a cost that scales linearly with program size. We implemented our type system for Java in a tool called the Index Checker. We evaluated the Index Checker on over 100,000 lines of open-source code and discovered array access errors even in well-tested, industrial projects such as Google Guava.
DOI: https://doi.org/10.1145/3213846.3213849 (published 2018-07-12)
Citations: 7
Shooting from the heap: ultra-scalable static analysis with heap snapshots
Neville Grech, G. Fourtounis, Adrian Francalanza, Y. Smaragdakis
Traditional whole-program static analysis (e.g., a points-to analysis that models the heap) encounters scalability problems for realistic applications. We propose a "featherweight" analysis that combines a dynamic snapshot of the heap with otherwise full static analysis of program behavior. The analysis is extremely scalable, offering speedups of well over 3x, with complexity empirically evaluated to grow linearly relative to the number of reachable methods. The analysis is also an excellent tradeoff of precision and recall (relative to different dynamic executions): while it can never fully capture all program behaviors (i.e., it cannot match the near-perfect recall of a full static analysis) it often approaches it closely while achieving much higher (3.5x) precision.
DOI: https://doi.org/10.1145/3213846.3213860 (published 2018-07-12)
Citations: 18
Bench4BL: reproducibility study on the performance of IR-based bug localization
Jaekwon Lee, Dongsun Kim, Tegawendé F. Bissyandé, Woosung Jung, Yves Le Traon
In recent years, the use of Information Retrieval (IR) techniques to automate the localization of buggy files, given a bug report, has shown promising results. The abundance of approaches in the literature, however, contrasts with the reality of IR-based bug localization (IRBL) adoption by developers (or even by the research community to complement other research approaches). Presumably, this situation is due to the lack of comprehensive evaluations for state-of-the-art approaches which offer insights into the actual performance of the techniques. We report on a comprehensive reproduction study of six state-of-the-art IRBL techniques. This study applies not only subjects used in existing studies (old subjects) but also 46 new subjects (61,431 Java files and 9,459 bug reports) to the IRBL techniques. In addition, the study compares two different version matching (between bug reports and source code files) strategies to highlight some observations related to performance deterioration. We also vary test file inclusion to investigate the effectiveness of IRBL techniques on test files, or its noise impact on performance. Finally, we assess potential performance gain if duplicate bug reports are leveraged.
DOI: https://doi.org/10.1145/3213846.3213856 (published 2018-07-12)
Citations: 59
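The Bench4BL abstract above concerns IR-based bug localization, that is, ranking source files by their textual similarity to a bug report. As a rough illustration of that general idea only, the sketch below ranks files with TF-IDF weights and cosine similarity. It is not one of the six techniques the study reproduces; the file names, file contents, and function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_files`) are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF so that no term gets a zero or negative weight.
        vectors.append({t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def rank_files(report, files):
    """Rank files (name -> text) by similarity to the bug report, best first."""
    names = list(files)
    vectors = tfidf_vectors([tokenize(files[n]) for n in names] + [tokenize(report)])
    query = vectors[-1]  # the bug report is the last document
    scores = [(name, cosine(vec, query)) for name, vec in zip(names, vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical project files and bug report, for illustration only.
files = {
    "Parser.java": "class Parser parse token stream syntax error",
    "Cache.java": "class Cache evict entry memory limit",
    "Logger.java": "class Logger write log message file",
}
report = "Cache does not evict entry when memory limit is reached"
for name, score in rank_files(report, files):
    print(f"{name}: {score:.3f}")
```

Real IRBL tools typically add preprocessing (stemming, identifier splitting) and extra signals such as version history, none of which is modeled here.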
Search-based detection of deviation failures in the migration of legacy spreadsheet applications
M. Almasi, H. Hemmati, G. Fraser, Phil McMinn, Janis Benefelds
Many legacy financial applications exist as a collection of formulas implemented in spreadsheets. Migration of these spreadsheets to a full-fledged system, written in a language such as Java, is an error-prone process. While small differences in the outputs of numerical calculations from the two systems are inevitable and tolerable, large discrepancies can have serious financial implications. Such discrepancies are likely due to faults in the migrated implementation, and are referred to as deviation failures. In this paper, we present a search-based technique that seeks to reveal deviation failures automatically. We evaluate different variants of this approach on two financial applications involving 40 formulas. These applications were produced by SEB Life & Pension Holding AB, who migrated their Microsoft Excel spreadsheets to a Java application. While traditional random and branch coverage-based test generation techniques were only able to detect approximately 25% and 32% of known faults in the migrated code respectively, our search-based approach detected up to 70% of faults with the same test generation budget. Without restriction of the search budget, up to 90% of known deviation failures were detected. In addition, three previously unknown faults were detected by this method that were confirmed by SEB experts.
DOI: https://doi.org/10.1145/3213846.3213861 (published 2018-07-12)
Citations: 0
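The abstract of the spreadsheet-migration paper above describes searching for inputs on which the original and migrated systems disagree by more than a tolerable amount. The sketch below illustrates that idea with hill climbing and random restarts, using the absolute output difference as the fitness. The abstract does not specify the authors' search algorithm or fitness function, so both are assumptions here, as are the two formulas, the seeded fault, the input ranges, and the `TOLERANCE` value.

```python
import random

TOLERANCE = 0.01  # assumed acceptable numerical difference between the two systems


def spreadsheet_formula(rate, years):
    """Hypothetical reference behaviour: compound growth factor."""
    return (1.0 + rate) ** years


def migrated_formula(rate, years):
    """Hypothetical migrated code with a seeded fault: simple interest beyond 10 years."""
    if years > 10:
        return 1.0 + rate * years
    return (1.0 + rate) ** years


def deviation(inputs):
    """Fitness: absolute difference between the two implementations."""
    rate, years = inputs
    return abs(spreadsheet_formula(rate, years) - migrated_formula(rate, years))


def neighbour(inputs, rng):
    """Small random change to the inputs, clipped to the assumed valid ranges."""
    rate, years = inputs
    rate = min(0.2, max(0.0, rate + rng.uniform(-0.01, 0.01)))
    years = min(40, max(1, years + rng.choice((-1, 0, 1))))
    return rate, years


def search_for_deviation(rng, restarts=10, steps=300):
    """Hill climbing with random restarts, maximising the deviation."""
    best, best_fit = None, -1.0
    for _ in range(restarts):
        current = (rng.uniform(0.0, 0.2), rng.randint(1, 40))
        current_fit = deviation(current)
        for _ in range(steps):
            candidate = neighbour(current, rng)
            candidate_fit = deviation(candidate)
            # Accept equal fitness too, so the search can drift across plateaus.
            if candidate_fit >= current_fit:
                current, current_fit = candidate, candidate_fit
        if current_fit > best_fit:
            best, best_fit = current, current_fit
    return best, best_fit


best_inputs, best_deviation = search_for_deviation(random.Random(0))
print("inputs:", best_inputs, "deviation:", best_deviation)
print("deviation failure" if best_deviation > TOLERANCE else "within tolerance")
```

The restarts matter in this toy setting because the two formulas agree exactly for `years <= 10`, which leaves the fitness flat in that region and gives plain hill climbing nothing to follow.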
Comparing developer-provided to user-provided tests for fault localization and automated program repair
René Just, Chris Parnin, Ian Drosos, Michael D. Ernst
To realistically evaluate a software testing or debugging technique, it must be run on defects and tests that are characteristic of those a developer would encounter in practice. For example, to determine the utility of a fault localization or automated program repair technique, it could be run on real defects from a bug tracking system, using real tests that are committed to the version control repository along with the fixes. Although such a methodology uses real tests, it may not use tests that are characteristic of the information a developer or tool would have in practice. The tests that a developer commits after fixing a defect may encode more information than was available to the developer when initially diagnosing the defect. This paper compares, both quantitatively and qualitatively, the developer-provided tests committed along with fixes (as found in the version control repository) versus the user-provided tests extracted from bug reports (as found in the issue tracker). It provides evidence that developer-provided tests are more targeted toward the defect and encode more information than user-provided tests. For fault localization, developer-provided tests overestimate a technique’s ability to rank a defective statement in the list of the top-n most suspicious statements. For automated program repair, developer-provided tests overestimate a technique’s ability to (efficiently) generate correct patches—user-provided tests lead to fewer correct patches and increased repair time. This paper also provides suggestions for improving the design and evaluation of fault localization and automated program repair techniques.
DOI: https://doi.org/10.1145/3213846.3213870 (published 2018-07-12)
Citations: 26
An empirical study on TensorFlow program bugs
Yuhao Zhang, Yifan Chen, S. Cheung, Yingfei Xiong, Lu Zhang
Deep learning applications become increasingly popular in important domains such as self-driving systems and facial identity systems. Defective deep learning applications may lead to catastrophic consequences. Although recent research efforts were made on testing and debugging deep learning applications, the characteristics of deep learning defects have never been studied. To fill this gap, we studied deep learning applications built on top of TensorFlow and collected program bugs related to TensorFlow from StackOverflow QA pages and Github projects. We extracted information from QA pages, commit messages, pull request messages, and issue discussions to examine the root causes and symptoms of these bugs. We also studied the strategies deployed by TensorFlow users for bug detection and localization. These findings help researchers and TensorFlow users to gain a better understanding of coding defects in TensorFlow programs and point out a new direction for future research.
DOI: https://doi.org/10.1145/3213846.3213866 (published 2018-07-12)
Citations: 218
Analyzing the analyzers: FlowDroid/IccTA, AmanDroid, and DroidSafe
Lina Qiu, Yingying Wang, J. Rubin
Numerous static analysis techniques have recently been proposed for identifying information flows in mobile applications. These techniques are compared to each other, usually on a set of syntactic benchmarks. Yet, configurations used for such comparisons are rarely described. Our experience shows that tools are often compared under different setups, rendering the comparisons irreproducible and largely inaccurate. In this paper, we provide a large, controlled, and independent comparison of the three most prominent static analysis tools: FlowDroid combined with IccTA, Amandroid, and DroidSafe. We evaluate all tools using a common configuration setup and the same set of benchmark applications. We compare the results of our analysis to the results reported in previous studies, identify the main reasons for inaccuracy in existing tools, and provide suggestions for future research.
DOI: https://doi.org/10.1145/3213846.3213873 (published 2018-07-12)
Citations: 83
Test case prioritization for acceptance testing of cyber physical systems: a multi-objective search-based approach
Seung Yeob Shin, S. Nejati, M. Sabetzadeh, L. Briand, Frank Zimmer
Acceptance testing validates that a system meets its requirements and determines whether it can be sufficiently trusted and put into operation. For cyber physical systems (CPS), acceptance testing is a hardware-in-the-loop process conducted in a (near-)operational environment. Acceptance testing of a CPS often necessitates that the test cases be prioritized, as there are usually too many scenarios to consider given time constraints. CPS acceptance testing is further complicated by the uncertainty in the environment and the impact of testing on hardware. We propose an automated test case prioritization approach for CPS acceptance testing, accounting for time budget constraints, uncertainty, and hardware damage risks. Our approach is based on multi-objective search, combined with a test case minimization algorithm that eliminates redundant operations from an ordered sequence of test cases. We evaluate our approach on a representative case study from the satellite domain. The results indicate that, compared to test cases that are prioritized manually by satellite engineers, our automated approach more than doubles the number of test cases that fit into a given time frame, while reducing to less than one third the number of operations that entail the risk of damage to key hardware components.
DOI: https://doi.org/10.1145/3213846.3213852 (published 2018-07-12)
Citations: 40
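The CPS prioritization abstract above weighs two competing goals: fitting as many test cases as possible into a time budget while limiting operations that risk hardware damage. The sketch below shows only the multi-objective comparison step, using Pareto dominance over candidate orderings. Exhaustive enumeration stands in for the search, which is feasible only for a toy suite; the authors' search algorithm, their minimization step, and their handling of uncertainty are not modeled. The `Case` type, the test names, durations, and risk counts are all hypothetical.

```python
import itertools
from typing import NamedTuple


class Case(NamedTuple):
    name: str
    minutes: int    # assumed execution time
    risky_ops: int  # assumed number of operations that may damage hardware


def evaluate(order, budget):
    """Objectives for one ordering: (cases that fit in the budget, risky operations run)."""
    used = fitted = risky = 0
    for case in order:
        if used + case.minutes > budget:
            break  # execution stops at the first case that no longer fits
        used += case.minutes
        fitted += 1
        risky += case.risky_ops
    return fitted, risky


def dominates(a, b):
    """a dominates b: at least as many cases, at most as much risk, and not identical."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def pareto_front(orders, budget):
    """Keep the orderings whose objective pair is not dominated by any other."""
    scored = [(evaluate(order, budget), order) for order in orders]
    return [(s, o) for s, o in scored if not any(dominates(t, s) for t, _ in scored)]


suite = [
    Case("power_cycle", 30, 2),
    Case("antenna_sweep", 20, 1),
    Case("telemetry", 10, 0),
    Case("status_poll", 10, 0),
    Case("thermal", 25, 1),
]
front = pareto_front(list(itertools.permutations(suite)), budget=45)
for score in sorted({s for s, _ in front}):
    print("cases fitted: %d, risky operations: %d" % score)
```

For this toy suite the front holds two trade-offs: orderings that run three cases with one risky operation, and orderings that run two cases with none. Choosing between them is left to the engineer.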
Advances in the ANaConDA framework for dynamic analysis and testing of concurrent C/C++ programs
Jan Fiedor, Monika Muzikovská, A. Smrčka, O. Vašíček, Tomáš Vojnar
The paper presents advances in the ANaConDA framework for dynamic analysis and testing of concurrent C/C++ programs. ANaConDA comes with several built-in analysers, covering detection of data races, deadlocks, or contract violations, and allows for an easy creation of new analysers. To increase the variety of tested interleavings, ANaConDA offers various noise injection techniques. The framework performs the analysis on a binary level, thus not requiring the source code of the program to be available. Apart from many academic experiments, ANaConDA has also been successfully used to discover various errors in industrial code.
本文介绍了用于并行C/ c++程序动态分析和测试的ANaConDA框架的研究进展。ANaConDA带有几个内置分析器,涵盖数据竞争、死锁或契约违反的检测,并允许轻松创建新的分析器。为了增加测试交错的多样性,ANaConDA提供了各种噪声注入技术。该框架在二进制级别上执行分析,因此不需要程序的源代码可用。除了许多学术实验外,ANaConDA还成功地用于发现工业代码中的各种错误。
{"title":"Advances in the ANaConDA framework for dynamic analysis and testing of concurrent C/C++ programs","authors":"Jan Fiedor, Monika Muzikovská, A. Smrčka, O. Vašíček, Tomáš Vojnar","doi":"10.1145/3213846.3229505","DOIUrl":"https://doi.org/10.1145/3213846.3229505","url":null,"abstract":"The paper presents advances in the ANaConDA framework for dynamic analysis and testing of concurrent C/C++ programs. ANaConDA comes with several built-in analysers, covering detection of data races, deadlocks, or contract violations, and allows for an easy creation of new analysers. To increase the variety of tested interleavings, ANaConDA offers various noise injection techniques. The framework performs the analysis on a binary level, thus not requiring the source code of the program to be available. Apart from many academic experiments, ANaConDA has also been successfully used to discover various errors in industrial code.","PeriodicalId":20542,"journal":{"name":"Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87809949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
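The noise injection idea mentioned in the abstract above can be illustrated with a small, generic sketch. This is not ANaConDA's API (ANaConDA instruments C/C++ binaries); the names `make_noise` and `run_counter` are hypothetical, and the code only assumes the general idea that random delays placed around shared-memory accesses perturb thread scheduling, so more interleavings are exercised.

```python
import random
import threading
import time


def make_noise(rng, probability=0.5, max_delay=0.0005):
    """Return a function that sometimes sleeps briefly to perturb scheduling."""
    rng_lock = threading.Lock()  # serialise access to the shared random generator

    def noise():
        with rng_lock:
            fire = rng.random() < probability
            delay = rng.uniform(0.0, max_delay)
        if fire:
            time.sleep(delay)

    return noise


def run_counter(noise, use_lock, threads=4, increments=25):
    """Increment a shared counter from several threads, injecting noise
    between the read and the write of each increment."""
    counter = {"value": 0}
    guard = threading.Lock()

    def increment():
        current = counter["value"]
        noise()  # widen the window between read and write
        counter["value"] = current + 1

    def worker():
        for _ in range(increments):
            if use_lock:
                with guard:
                    increment()
            else:
                increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return counter["value"]
```

With `use_lock=True` the counter always reaches `threads * increments`. With `use_lock=False` the injected delays make lost updates likely, so the final value may fall short of that total; the exact value varies from run to run.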
Automated test mapping and coverage for network topologies
P. Strandberg, T. Ostrand, E. Weyuker, Daniel Sundmark, W. Afzal
Communication devices such as routers and switches play a critical role in the reliable functioning of embedded system networks. Dozens of such devices may be part of an embedded system network, and they need to be tested in conjunction with various computational elements on actual hardware, in many different configurations that are representative of actual operating networks. An individual physical network topology can serve as the basis for a test system that executes many test cases, by identifying the part of the physical topology that corresponds to the configuration each test case requires. Given a set of available test systems and a large number of test cases, the problem is to determine, for each test case, which test systems are suitable for executing it, and to provide the mapping that associates the test case elements (the logical network topology) with the appropriate elements of the test system (the physical network topology). We studied a real industrial environment where this problem was originally handled by a simple software procedure that was very slow in many cases and failed to provide thorough coverage of each network's elements. In this paper, we represent both the test systems and the test cases as graphs, and develop a new prototype algorithm that a) determines whether a test case can be mapped to a subgraph of the test system, b) rapidly finds mappings that do exist, and c) exercises diverse sets of network nodes when multiple mappings exist for the test case. The prototype has been implemented and applied to over 10,000 combinations of test cases and test systems, and it reduced the computation time by a factor of more than 80 relative to the original procedure. In addition, relative to a meaningful measure of network topology coverage, the mappings achieved an increased level of thoroughness in exercising the elements of each test system.
{"title":"Automated test mapping and coverage for network topologies","authors":"P. Strandberg, T. Ostrand, E. Weyuker, Daniel Sundmark, W. Afzal","doi":"10.1145/3213846.3213859","DOIUrl":"https://doi.org/10.1145/3213846.3213859","journal":{"name":"Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis"},"publicationDate":"2018-07-12"}
Citations: 8
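The three steps named in the abstract above (deciding whether a mapping exists, finding mappings, and preferring mappings that exercise different nodes) can be sketched with a plain backtracking search. This is not the authors' prototype algorithm, which the abstract does not describe in detail. It is a minimal illustration that assumes undirected graphs stored as dictionaries from node name to neighbour set; the function names `find_mappings` and `pick_diverse` are hypothetical.

```python
def find_mappings(logical, physical):
    """Yield injective mappings from logical nodes to physical nodes such that
    every logical edge is also an edge between the mapped physical nodes.

    Both graphs are dicts: node name -> set of neighbouring node names.
    """
    # Place the most connected logical nodes first to prune the search early.
    order = sorted(logical, key=lambda node: -len(logical[node]))

    def extend(mapping, index):
        if index == len(order):
            yield dict(mapping)
            return
        node = order[index]
        used = set(mapping.values())
        for candidate in sorted(physical):
            if candidate in used:
                continue
            if len(physical[candidate]) < len(logical[node]):
                continue  # too few links to host this logical node
            # Each already-mapped logical neighbour must be a physical neighbour.
            if all(mapping[nb] in physical[candidate]
                   for nb in logical[node] if nb in mapping):
                mapping[node] = candidate
                yield from extend(mapping, index + 1)
                del mapping[node]

    yield from extend({}, 0)


def pick_diverse(mappings, already_covered):
    """Return the mapping that uses the most physical nodes not covered so far,
    or None if there are no mappings."""
    return max(
        mappings,
        key=lambda m: len(set(m.values()) - already_covered),
        default=None,
    )
```

A test case is executable on a test system exactly when `find_mappings` yields at least one mapping. Passing the nodes used by earlier test cases as `already_covered` steers each new choice toward parts of the physical topology that have not been exercised yet. The search is exponential in the worst case, so this sketch is suitable only for small topologies.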