Inducing Subtle Mutations with Program Repair
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00018
F. Schwander, Rahul Gopinath, A. Zeller
Mutation analysis is the gold standard for assessing the effectiveness of a test suite at preventing bugs. It involves injecting syntactic changes into the program under test, generating variants (mutants), and checking whether the test suite detects them. Practitioners often rely on the surviving (live) mutants to decide which test cases to write to improve test suite effectiveness. While a majority of such syntactic changes result in semantic differences from the original, a change may fail to induce a corresponding semantic change in the mutant. Such equivalent mutants waste manual effort. We describe a novel technique that produces high-quality mutants while avoiding the generation of equivalent mutants for input processors. Our idea is to generate plausible, near-correct inputs for the program, collect those that are rejected, and generate variants that accept these rejected strings. This technique allows us to provide an enhanced set of mutants along with newly generated test cases that kill them. We evaluate our method on eight Python programs and show that our technique can generate new mutants that are both interesting for the developer and guaranteed to be mortal.
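To make the idea concrete, here is a minimal Python sketch of the loop the abstract describes, assuming a toy input processor and crude source-level variants as stand-ins for the paper's repair-based mutant generation:

```python
import random

# toy input processor standing in for the program under test (assumption)
SUBJECT = "def accepts(s):\n    return s.isdigit() and len(s) <= 4\n"

def load(src):
    env = {}
    exec(src, env)
    return env["accepts"]

def near_valid(seed):
    # perturb a valid seed to get a plausible, near-correct input
    i = random.randrange(len(seed) + 1)
    c = random.choice("0123456789a")
    op = random.choice(["insert", "delete", "replace"])
    if op == "insert":
        return seed[:i] + c + seed[i:]
    if op == "delete" and seed:
        return seed[:-1]
    return seed[:i] + c + seed[i + 1:]

def variants(src):
    # crude stand-ins for repair-generated mutants that relax the acceptor
    yield src.replace("<= 4", "<= 5")
    yield src.replace("s.isdigit() and ", "")

original = load(SUBJECT)
rejected = {s for s in (near_valid("1234") for _ in range(200)) if not original(s)}

for mutant_src in variants(SUBJECT):
    mutant = load(mutant_src)
    killers = [r for r in rejected if mutant(r)]
    if killers:
        # the mutant accepts a string the original rejects, so the input
        # below is a test that kills it: the mutant is mortal by construction
        print("mortal mutant; killing input:", repr(killers[0]))
```

Because each reported mutant accepts a string the original rejects, it comes paired with the very test input that kills it, which is what rules out equivalent mutants in this setting.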
{"title":"Inducing Subtle Mutations with Program Repair","authors":"F. Schwander, Rahul Gopinath, A. Zeller","doi":"10.1109/ICSTW52544.2021.00018","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00018","url":null,"abstract":"Mutation analysis is the gold standard for assessing the effectiveness of a test suite to prevent bugs. It involves injecting syntactic changes in the program, generating variants (mutants) of the program under test, and checking whether the test suite detects the mutant. Practitioners often rely on these live mutants to decide what test cases to write for improving the test suite effectiveness.While a majority of such syntactic changes result in semantic differences from the original, it is possible that such a change fails to induce a corresponding semantic change in the mutant. Such equivalent mutants can lead to wastage of manual effort.We describe a novel technique that produces high-quality mutants while avoiding the generation of equivalent mutants for input processors. Our idea is to generate plausible, near correct inputs for the program, collect those rejected, and generate variants that accept these rejected strings. This technique allows us to provide an enhanced set of mutants along with newly generated test cases that kill them.We evaluate our method on eight python programs and show that our technique can generate new mutants that are both interesting for the developer and guaranteed to be mortal.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132493756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assuring Fairness of Algorithmic Decision Making
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00029
Marc P. Hauer, R. Adler, K. Zweig
Assuring the fairness of an algorithmic decision-making (ADM) system is a challenging task involving different and possibly conflicting views on fairness, as expressed by multiple fairness measures. We argue that a combination of the agile development framework Acceptance Test-Driven Development (ATDD) and the concept of Assurance Cases from safety engineering is a pragmatic way to assure fairness levels that are adequate for a predefined application. The approach supports examinations by regulatory bodies or related auditing processes by providing a structured argument explaining the achieved level of fairness and its sufficiency for the application.
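As an illustration only, an ATDD-style fairness acceptance test might look like the following Python sketch; the demographic-parity measure, the toy decisions, and the 0.5 bound are illustrative assumptions, not taken from the paper:

```python
def demographic_parity_diff(y_pred, group):
    # largest gap in positive-decision rates across groups
    rates = {}
    for g in set(group):
        decisions = [p for p, gg in zip(y_pred, group) if gg == g]
        rates[g] = sum(decisions) / len(decisions)
    return max(rates.values()) - min(rates.values())

def test_fairness_acceptance():
    # toy decisions from an ADM system for two demographic groups
    y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
    group  = ["A", "A", "A", "A", "B", "B", "B", "B"]
    # application-specific bound agreed on in the acceptance criteria
    assert demographic_parity_diff(y_pred, group) <= 0.5

test_fairness_acceptance()
```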
{"title":"Assuring Fairness of Algorithmic Decision Making","authors":"Marc P. Hauer, R. Adler, K. Zweig","doi":"10.1109/ICSTW52544.2021.00029","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00029","url":null,"abstract":"Assuring fairness of an algorithmic decision making (ADM) system is a challenging task involving different and possibly conflicting views on fairness as expressed by multiple fairness measures. We argue that a combination of the agile development framework Acceptance Test-Driven Development (ATDD) and the concept of Assurance Cases from safety engineering is a pragmatic way to assure fairness levels that are adequate for a predefined application. The approach supports examinations by regulating bodies or related auditing processes by providing a structured argument explaining the achieved level of fairness and its sufficiency for the application.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123280446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Combinatorial Approach to Explaining Image Classifiers
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00019
Jaganmohan Chandrasekaran, Yu Lei, R. Kacker, D. R. Kuhn
Machine Learning (ML) models, a core component of artificial intelligence systems, often come as a black box to the user, leading to the problem of interpretability. Explainable Artificial Intelligence (XAI) is key to providing confidence and trustworthiness for machine learning-based software systems. We observe a fundamental connection between XAI and software fault localization. In this paper, we present an approach that uses BEN, a combinatorial testing-based software fault localization approach, to produce explanations for decisions made by ML models.
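BEN itself is not reproduced here, but the underlying analogy can be sketched: treat image regions as test factors, mask t-way combinations of them, and rank regions by how often combinations containing them flip the decision, much as fault localization ranks suspicious code. A hedged Python sketch with a toy classifier as a stand-in:

```python
import numpy as np
from itertools import combinations

def mask(image, regions):
    # zero out the given (row0, row1, col0, col1) regions
    out = image.copy()
    for r0, r1, c0, c1 in regions:
        out[r0:r1, c0:c1] = 0
    return out

def suspicious_regions(predict, image, regions, label):
    # rank regions by how often 2-way masking combinations that include
    # them flip the model's decision (the fault-localization analogy)
    scores = [0] * len(regions)
    for i, j in combinations(range(len(regions)), 2):
        if predict(mask(image, [regions[i], regions[j]])) != label:
            scores[i] += 1
            scores[j] += 1
    return sorted(range(len(regions)), key=lambda i: -scores[i])

# toy classifier: "positive" if overall brightness exceeds a bar (assumption)
predict = lambda img: int(img.mean() > 0.02)
image = np.zeros((8, 8)); image[1:3, 1:3] = 1.0   # bright blob in quadrant 0
quadrants = [(0, 4, 0, 4), (0, 4, 4, 8), (4, 8, 0, 4), (4, 8, 4, 8)]
print(suspicious_regions(predict, image, quadrants, predict(image)))
# quadrant 0 ranks first: masking combinations covering it flip the decision
```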
{"title":"A Combinatorial Approach to Explaining Image Classifiers","authors":"Jaganmohan Chandrasekaran, Yu Lei, R. Kacker, D. R. Kuhn","doi":"10.1109/ICSTW52544.2021.00019","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00019","url":null,"abstract":"Machine Learning (ML) models, a core component to artificial intelligence systems, often come as a black box to the user, leading to the problem of interpretability. Explainable Artificial Intelligence (XAI) is key to providing confidence and trustworthiness for machine learning-based software systems. We observe a fundamental connection between XAI and software fault localization. In this paper, we present an approach that uses BEN, a combinatorial testing-based software fault localization approach, to produce explanations for decisions made by ML models.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"13 1-4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120964651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Active Machine Learning to Test Autonomous Driving
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00055
K. Meinke
Autonomous driving represents a significant challenge to all software quality assurance techniques, including testing. Generative machine learning (ML) techniques, including active ML, have considerable potential to generate high-quality synthetic test data that can complement and improve on existing techniques such as hardware-in-the-loop and road testing.
{"title":"Active Machine Learning to Test Autonomous Driving","authors":"K. Meinke","doi":"10.1109/ICSTW52544.2021.00055","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00055","url":null,"abstract":"Autonomous driving represents a significant challenge to all software quality assurance techniques, including testing. Generative machine learning (ML) techniques including active ML have considerable potential to generate high quality synthetic test data that can complement and improve on existing techniques such as hardware-in-the-loop and road testing.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122315662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enabling Fast Exploration and Validation of Thermal Dissipation Requirements for Heterogeneous SoCs
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00030
Joel Öhrling, D. Truscan, S. Lafond
Managing the energy consumption and thermal dissipation of multi-core heterogeneous platforms is becoming increasingly important, as both can directly impact platform performance. This paper discusses an approach that enables fast exploration and validation of heterogeneous system-on-chip (SoC) platform configurations with respect to their thermal dissipation. Such platforms can be configured to find the optimal trade-off between performance and power consumption. This trade-off is directly reflected in the heat dissipation of the platform, which, when it rises above a given threshold, actually decreases the platform's performance. Therefore, it is important to be able to quickly probe and explore different configurations and identify the most suitable one. However, this task is hindered by the large space of possible configurations and by the time required to benchmark each configuration. We therefore propose an approach in which we construct a model of the thermal dissipation of a given platform using system identification methods, and then use this model to explore and validate different configurations. The approach allows us to decrease the exploration time by several orders of magnitude. We exemplify the approach on an Odroid-XU4 board featuring an Exynos 5422 SoC.
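As a rough illustration of the system-identification step, the following Python sketch fits a simple ARX-style linear model of temperature and replays a candidate configuration on the model instead of the board; the model structure and the synthetic data are assumptions for illustration, not the paper's:

```python
import numpy as np

def fit_thermal_model(T, f):
    # T has len(f)+1 samples; fit T[k+1] = a*T[k] + b*f[k] + c by least squares
    X = np.column_stack([T[:-1], f, np.ones(len(f))])
    coeffs, *_ = np.linalg.lstsq(X, T[1:], rcond=None)
    return coeffs

def simulate(coeffs, T0, f_schedule):
    # replay a candidate configuration on the model instead of the hardware
    a, b, c = coeffs
    T = [T0]
    for fk in f_schedule:
        T.append(a * T[-1] + b * fk + c)
    return T

# synthetic "measurements" standing in for benchmarked board data (assumption)
rng = np.random.default_rng(0)
f = rng.uniform(0.6, 2.0, 300)                   # frequency setting in GHz
T = np.empty(301); T[0] = 40.0
for k in range(300):
    T[k + 1] = 0.95 * T[k] + 1.5 * f[k] + 0.5 + rng.normal(0, 0.05)

coeffs = fit_thermal_model(T, f)
peak = max(simulate(coeffs, 40.0, [2.0] * 600))  # probe one configuration
print("predicted peak temperature:", round(peak, 1))
```

Once the model is fitted from one benchmarking run, probing a configuration costs a simulation loop rather than a hardware benchmark, which is where the orders-of-magnitude speedup comes from.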
{"title":"Enabling Fast Exploration and Validation of Thermal Dissipation Requirements for Heterogeneous SoCs","authors":"Joel Öhrling, D. Truscan, S. Lafond","doi":"10.1109/ICSTW52544.2021.00030","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00030","url":null,"abstract":"The management of the energy consumption and thermal dissipation of multi-core heterogeneous platforms is becoming increasingly important as it can have direct impact on the platform performance. This paper discusses an approach that enables fast exploration and validation of heterogeneous system on chips (SoCs) platform configurations with respect to their thermal dissipation. Such platforms can be configured to find the optimal trade-off between performance and power consumption. This directly reflects in the head dissipation of the platform, which when increases over a given threshold will actually decrease the performance of the platform. Therefore, it is important to be able to quickly probe and explore different configurations and identify the most suitable one. However, this task is hindered by the large space of possible configurations of such platforms and by the time required to benchmark each configurations. As such, we propose an approach in which we construct a model of the thermal dissipation of a given platform using a system identification methods and then we use this model to explore and validate different configurations. The approach allows us to decrease the exploration time with several orders of magnitude. We exemplify the approach on an Odroid-XU4 board featuring an Exynos 5422 SoC.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128149887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combinatorially XSSing Web Application Firewalls
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00026
Bernhard Garn, Daniel Sebastian Lang, Manuel Leithner, D. R. Kuhn, R. Kacker, D. Simos
Cross-site scripting (XSS) is a common class of vulnerabilities in the domain of web applications. As it remains prevalent despite continued efforts by practitioners and researchers, site operators often seek to protect their assets using web application firewalls (WAFs). These systems employ filtering mechanisms to intercept and reject requests that may be suitable to exploit XSS flaws and related vulnerabilities such as SQL injections. However, they generally do not offer complete protection and can often be bypassed using specifically crafted exploits. In this work, we evaluate the effectiveness of WAFs at detecting XSS exploits. We develop an attack grammar and use a combinatorial testing approach to generate attack vectors. We compare our vectors with conventional counterparts in their ability to bypass different WAFs. Our results show that the vectors generated with combinatorial testing perform equally well or better in almost all cases. They further confirm that most of the rule sets evaluated in this work can be bypassed by at least one of these crafted inputs.
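To illustrate the flavor of the approach, here is a hedged Python sketch of grammar-driven vector generation; the three-parameter grammar is a toy, and a real combinatorial approach would sample a t-way covering array (e.g., with NIST's ACTS tool) rather than enumerate the full product shown here:

```python
from itertools import product

# toy attack grammar with three parameters (illustrative, not the paper's)
GRAMMAR = {
    "tag":     ["img", "svg", "body"],
    "event":   ["onerror", "onload"],
    "payload": ["alert(1)", "confirm(1)"],
}

def vectors():
    # each combination of parameter values yields one attack vector
    for tag, event, payload in product(*GRAMMAR.values()):
        yield f'<{tag} src=x {event}="{payload}">'

for v in vectors():
    print(v)  # submit each vector to the WAF-protected endpoint; log verdicts
```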
{"title":"Combinatorially XSSing Web Application Firewalls","authors":"Bernhard Garn, Daniel Sebastian Lang, Manuel Leithner, D. R. Kuhn, R. Kacker, D. Simos","doi":"10.1109/ICSTW52544.2021.00026","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00026","url":null,"abstract":"Cross-Site scripting (XSS) is a common class of vulnerabilities in the domain of web applications. As it re-mains prevalent despite continued efforts by practitioners and researchers, site operators often seek to protect their assets using web application firewalls (WAFs). These systems employ filtering mechanisms to intercept and reject requests that may be suitable to exploit XSS flaws and related vulnerabilities such as SQL injections. However, they generally do not offer complete protection and can often be bypassed using specifically crafted exploits. In this work, we evaluate the effectiveness of WAFs to detect XSS exploits. We develop an attack grammar and use a combinatorial testing approach to generate attack vectors. We compare our vectors with conventional counterparts and their ability to bypass different WAFs. Our results show that the vectors generated with combinatorial testing perform equal or better in almost all cases. They further confirm that most of the rule sets evaluated in this work can be bypassed by at least one of these crafted inputs.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130113584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving Mobile User Interface Testing with Model Driven Monkey Search
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00034
Jordan Doyle, Takfarinas Saber, Paolo Arcaini, Anthony Ventresque
Testing mobile applications often relies on tools, such as Exerciser Monkey for Android systems, that simulate user input. Exerciser Monkey, for example, generates random events (e.g., touches, gestures, navigational keys) that give developers a sense of what their application will do when deployed on real mobile phones with real users interacting with it. These tools, however, have no knowledge of the underlying application's structure and only interact with it randomly or in a predefined manner (e.g., following developer-designed scenarios, a labour-intensive task), making them slow and poor at finding bugs. In this paper, we propose a novel control flow structure able to represent the code of Android applications, including all the interactive elements. We show that our structure can increase the effectiveness (higher coverage) and efficiency (fewer duplicate/redundant tests) of the Exerciser Monkey by giving it knowledge of the test environment. We compare the interface coverage achieved by the Exerciser Monkey with that of our new Monkey++, which uses a depth-first search of our control flow structure, and show that while the random nature of Exerciser Monkey creates slow test suites with poor coverage, the test suite created by a depth-first search is an order of magnitude faster and achieves full coverage of the user interaction elements. We believe this research will lead to a more effective and efficient Exerciser Monkey, as well as better targeted search-based techniques for automated Android testing.
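The gain from a depth-first traversal can be sketched in a few lines of Python, assuming screens as nodes and interactive elements as labelled edges; the paper's actual control flow structure is richer than this toy graph:

```python
def dfs_events(graph, screen, visited=None):
    # graph maps a screen to its (ui_element, next_screen) transitions
    if visited is None:
        visited = set()
    visited.add(screen)
    events = []
    for element, nxt in graph.get(screen, []):
        events.append((screen, element))  # fire this element's event once
        if nxt not in visited:
            events.extend(dfs_events(graph, nxt, visited))
    return events

# toy app graph (assumption); Monkey++ derives its structure from the app code
app = {
    "main":  [("login_btn", "login"), ("about_btn", "about")],
    "login": [("submit_btn", "home")],
    "about": [("back_btn", "main")],
    "home":  [],
}
print(dfs_events(app, "main"))  # every interactive element exercised exactly once
```

Unlike random event injection, which revisits the same screens indefinitely, the traversal touches each interactive element exactly once, which is what yields full interface coverage in a short test suite.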
{"title":"Improving Mobile User Interface Testing with Model Driven Monkey Search","authors":"Jordan Doyle, Takfarinas Saber, Paolo Arcaini, Anthony Ventresque","doi":"10.1109/ICSTW52544.2021.00034","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00034","url":null,"abstract":"Testing mobile applications often relies on tools, such as Exerciser Monkey for Android systems, that simulate user input. Exerciser Monkey, for example, generates random events (e.g., touches, gestures, navigational keys) that give developers a sense of what their application will do when deployed on real mobile phones with real users interacting with it. These tools, however, have no knowledge of the underlying applications' structures and only interact with them randomly or in a predefined manner (e.g., if developers designed scenarios, a labour-intensive task) - making them slow and poor at finding bugs.In this paper, we propose a novel control flow structure able to represent the code of Android applications, including all the interactive elements. We show that our structure can increase the effectiveness (higher coverage) and efficiency (removing duplicate/redundant tests) of the Exerciser Monkey by giving it knowledge of the test environment. We compare the interface coverage achieved by the Exerciser Monkey with our new Monkey++ using a depth first search of our control flow structure and show that while the random nature of Exerciser Monkey creates slow test suites of poor coverage, the test suite created by a depth first search is one order of magnitude faster and achieves full coverage of the user interaction elements. We believe this research will lead to a more effective and efficient Exerciser Monkey, as well as better targeted search based techniques for automated Android testing.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130266958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI-based Test Automation: A Grey Literature Analysis
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00051
F. Ricca, A. Marchetto, Andrea Stocco
This paper provides the results of a survey of the grey literature concerning the use of artificial intelligence to improve test automation practices. We surveyed more than 1,200 sources of grey literature (e.g., blogs, white papers, user manuals, Stack Overflow posts) looking for highlights by professionals on how AI is adopted to aid the development and evolution of test code. Ultimately, we filtered 136 relevant documents from which we extracted a taxonomy of problems that AI aims to tackle, along with a taxonomy of AI-enabled solutions to those problems. Manual code development and automated test generation are the most cited problem and solution, respectively. The paper concludes by distilling the six most prevalent tools on the market, along with think-aloud reflections on the current and future status of artificial intelligence for test automation.
{"title":"AI-based Test Automation: A Grey Literature Analysis","authors":"F. Ricca, A. Marchetto, Andrea Stocco","doi":"10.1109/ICSTW52544.2021.00051","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00051","url":null,"abstract":"This paper provides the results of a survey of the grey literature concerning the use of artificial intelligence to improve test automation practices. We surveyed more than 1,200 sources of grey literature (e.g., blogs, white-papers, user manuals, StackOverflow posts) looking for highlights by professionals on how AI is adopted to aid the development and evolution of test code. Ultimately, we filtered 136 relevant documents from which we extracted a taxonomy of problems that AI aims to tackle, along with a taxonomy of AI-enabled solutions to such problems. Manual code development and automated test generation are the most cited problem and solution, respectively. The paper concludes by distilling the six most prevalent tools on the market, along with think-aloud reflections about the current and future status of artificial intelligence for test automation.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122945632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Practical Method for API Testing in the Context of Continuous Delivery and Behavior Driven Development
Pub Date: 2021-04-01 | DOI: 10.1109/ICSTW52544.2021.00020
Brian Elgaard Bennett
Enterprises are increasingly adopting an API-first approach to connect and expose software services. Saxo Bank is no exception. Crafting test suites for such APIs can seem straightforward due to their headless nature, but our experience shows that test suites often have two problems. The first is that tests tend to fail and pass in seemingly nondeterministic ways (tests are flaky). The second is that functional coverage is not clearly documented. We have found that both problems stem from a lack of explicit focus on initial context (IC), a concept from behavior-driven development. When a test is flaky, it is often because the actual IC in the test environment is not as required by the test. When functional coverage is not clear, it is most often because a systematic analysis involving IC was not performed. We propose a method for test analysis in which we include IC in the input space when analyzing functional coverage for an API, thereby including anything that can influence the outcome of test cases. Establishing IC is in general a hard problem. We have found that focusing on the bounded context, a concept from domain-driven design, of the system under test is a practical way to establish relevant IC. Experience with Saxo Bank's Open API shows that this method allows testers and developers to cooperate continuously, producing test plan documents that include the reasoning behind functional coverage. Explicit focus on IC in automated test case implementations turns flaky tests into tests that report on required IC in a test environment. The method generalizes easily to all levels of API tests.
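As a sketch of what explicit IC looks like in an automated test, consider the following Python example; the `require_ic` helper, the endpoint, and the account fixture are hypothetical, introduced only to illustrate failing fast with an IC report instead of a flaky downstream assertion:

```python
import requests  # assumption: a plain HTTP client is used against the API

BASE = "https://example.test/api"  # hypothetical endpoint

def require_ic(predicate, description):
    # make required initial context explicit; fail with a clear IC report
    if not predicate():
        raise AssertionError(f"required initial context not met: {description}")

def test_get_account_balance():
    require_ic(lambda: requests.get(f"{BASE}/accounts/42").status_code == 200,
               "account 42 exists in the test environment")
    resp = requests.get(f"{BASE}/accounts/42/balance")
    assert resp.status_code == 200
    assert "balance" in resp.json()
```

When the environment drifts (e.g., the fixture account is missing), the test now reports the unmet IC directly instead of failing nondeterministically on the balance assertion.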
{"title":"A Practical Method for API Testing in the Context of Continuous Delivery and Behavior Driven Development","authors":"Brian Elgaard Bennett","doi":"10.1109/ICSTW52544.2021.00020","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00020","url":null,"abstract":"Enterprises are increasingly adopting an API-first approach to connect and expose software services. Saxo Bank is no exception to this.Crafting test suites for such APIs can seem straight forward due to the headless nature, but our experience shows that test suites often have two problems. The first problem is that execution of tests tends to fail and pass in seemingly nondeterministic ways (tests are flaky). The second problem is that functional coverage is not clearly documented.We have found that both problems stem from a lack of explicit focus on initial context (IC), a concept from behavior driven development. When a test is flaky it is often because actual IC in the test environment is not as required by the test. When functional coverage is not clear, it is most often because a systematic analysis involving IC was not performed.We propose a method for test analysis in which we include IC in the input space when analyzing functional coverage for an API, thereby including anything which can influence the outcome of test cases.Establishing IC is in general a hard problem. We have found that focus on the bounded context, a concept from domain driven design, of the system under test is a practical way to establish relevant IC.Experience with Saxo Bank's Open API shows that this method allows testers and developers to cooperate continuously, producing test plan documents which include the reasoning behind functional coverage. Explicit focus on IC in automated test case implementations turns flaky tests into tests which report on required IC in a test environment. The method easily generalizes to all levels of API tests.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"259 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123087320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test Automation with Grad-CAM Heatmaps - A Future Pipe Segment in MLOps for Vision AI?
Pub Date: 2021-03-02 | DOI: 10.1109/ICSTW52544.2021.00039
Markus Borg, Ronald Jabangwe, Simon Åberg, Arvid Ekblom, Ludwig Hedlund, August Lidfeldt
Machine Learning (ML) is a fundamental part of modern perception systems. In the last decade, the performance of computer vision using trained deep neural networks has surpassed previous approaches based on careful feature engineering. However, the opaqueness of large ML models is a substantial impediment for critical applications such as the automotive context. As a remedy, Gradient-weighted Class Activation Mapping (Grad-CAM) has been proposed to provide visual explanations of model internals. In this paper, we demonstrate how Grad-CAM heatmaps can be used to increase the explainability of an image recognition model trained for a pedestrian underpass. We argue how the heatmaps support compliance with the EU's seven key requirements for Trustworthy AI. Finally, we propose adding automated heatmap analysis as a pipe segment in an MLOps pipeline. We believe that such a building block can be used to automatically detect whether a trained ML model is activated based on invalid pixels in test images, suggesting a biased model.
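A minimal Python sketch of such a pipe segment, assuming a Grad-CAM heatmap has already been computed (e.g., with a library such as pytorch-grad-cam) and that a mask of valid pixels is known for the scene; the 0.8 threshold is an illustrative assumption:

```python
import numpy as np

def heatmap_check(heatmap, valid_mask, threshold=0.8):
    # flag the model if too much Grad-CAM activation falls on invalid pixels
    total = heatmap.sum()
    if total == 0:
        return False
    return heatmap[valid_mask].sum() / total >= threshold

# toy example: activation concentrated inside the valid (e.g., walkway) region
hm = np.zeros((4, 4)); hm[1:3, 1:3] = 1.0
mask = np.zeros((4, 4), dtype=bool); mask[1:3, :] = True
print(heatmap_check(hm, mask))  # True: the model attends to valid pixels
```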
{"title":"Test Automation with Grad-CAM Heatmaps - A Future Pipe Segment in MLOps for Vision AI?","authors":"Markus Borg, Ronald Jabangwe, Simon Åberg, Arvid Ekblom, Ludwig Hedlund, August Lidfeldt","doi":"10.1109/ICSTW52544.2021.00039","DOIUrl":"https://doi.org/10.1109/ICSTW52544.2021.00039","url":null,"abstract":"Machine Learning (ML) is a fundamental part of modern perception systems. In the last decade, the performance of computer vision using trained deep neural networks has outperformed previous approaches based on careful feature engineering. However, the opaqueness of large ML models is a substantial impediment for critical applications such as in the automotive context. As a remedy, Gradient-weighted Class Activation Mapping (Grad-CAM) has been proposed to provide visual explanations of model internals. In this paper, we demonstrate how Grad-CAM heatmaps can be used to increase the explainability of an image recognition model trained for a pedestrian underpass. We argue how the heatmaps support compliance to the EU’s seven key requirements for Trustworthy AI. Finally, we propose adding automated heatmap analysis as a pipe segment in an MLOps pipeline. We believe that such a building block can be used to automatically detect if a trained ML-model is activated based on invalid pixels in test images, suggesting biased models.","PeriodicalId":371680,"journal":{"name":"2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126511173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}