2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST): Latest Papers

Automated Function Assessment in Driving Scenarios

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00050

Christian King, Lennart Ries, Christopher Kober, C. Wohlfahrt, E. Sax

In recent years, numerous innovations in the automotive industry have addressed the field of driver assistance systems and automated driving. The additional sensors these systems require, together with the need for digital maps and online services, lead to an ever-increasing system space that must be covered by testing. Established test approaches in the area of Hardware-in-the-Loop (HiL) use predefined, structured test cases to test systems against requirements. In such systematic testing, an evaluation is carried out only for a specific test case or for the duration of a single test step. This paper presents a concept for an automated quality assessment of driving scenarios or digital test drives. The aim is the analysis and subsequent evaluation of continuous function behavior during a realistic test drive within a simulated environment. Compared to conventional systematic test approaches, the presented concept allows a continuous evaluation of the test drive, so that systems can be evaluated multiple times in similar scenarios with deviating boundary conditions. For the first time, this enables a functional evaluation of a complete test drive comprising numerous scenarios and situations. The presented approach was prototypically implemented and demonstrated on a HiL test bench evaluating an adaptive cruise control (ACC) system.
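To make the idea of continuous evaluation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a recorded test drive is available as samples of time, ego speed, and distance to the lead vehicle, and it checks an ACC time-gap criterion at every sample instead of once per test step. The function name, the sample layout, and the 1.0 s threshold are all illustrative assumptions.

```python
def time_gap_violations(samples, min_gap_s=1.0):
    """Return (start, end) time intervals in which the time gap is too small.

    samples: iterable of (t, ego_speed_mps, distance_m) tuples (assumed layout).
    The time gap is distance divided by ego speed; min_gap_s is a
    hypothetical threshold, not a value taken from the paper.
    """
    intervals = []
    start = None
    last_t = None
    for t, speed, distance in samples:
        violated = speed > 0 and distance / speed < min_gap_s
        if violated and start is None:
            start = t                          # a violation interval begins
        elif not violated and start is not None:
            intervals.append((start, last_t))  # interval ended at previous sample
            start = None
        last_t = t
    if start is not None:
        intervals.append((start, last_t))      # drive ended during a violation
    return intervals


# Illustrative drive: time gaps are 2.0, 0.9, 0.75, 1.5 and 0.8 seconds.
drive = [(0, 20.0, 40.0), (1, 20.0, 18.0), (2, 20.0, 15.0),
         (3, 20.0, 30.0), (4, 25.0, 20.0)]
violations = time_gap_violations(drive)
```

Because every sample is assessed, the same criterion is evaluated each time a comparable following situation recurs during the drive, which is the contrast with per-test-step evaluation that the abstract draws.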



Citations: 9

Fixing of Security Vulnerabilities in Open Source Projects: A Case Study of Apache HTTP Server and Apache Tomcat

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00017

Valentina Piantadosi, Simone Scalabrino, R. Oliveto

Software vulnerabilities are particularly dangerous bugs that may allow an attacker to violate the confidentiality, integrity or availability constraints of a software system. Fixing vulnerabilities quickly is of primary importance; it is equally crucial to release complete patches that leave no corner case uncovered. In this paper we study the process of vulnerability fixing in Open Source Software. We focus on three dimensions: personal, i.e., who fixes software vulnerabilities; temporal, i.e., how long it takes to release a patch; procedural, i.e., what process is followed to fix the vulnerability. In the context of our study we analyzed 337 CVE entries regarding Apache HTTP Server and Apache Tomcat, and we manually linked them to the patches written to fix such vulnerabilities and their related commits. The results show that developers who fix software vulnerabilities are much more experienced than the average. Furthermore, we observed that vulnerabilities are fixed through more than one commit and, surprisingly, that in about 3% of the cases such vulnerabilities show up again in future releases (i.e., they are not actually fixed). In light of these results, we derived some lessons learned that represent a starting point for future research directions aiming at better supporting developers during the documentation and fixing of vulnerabilities.



Citations: 25

Techniques for Evolution-Aware Runtime Verification

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00037

Owolabi Legunsen, Yi Zhang, Milica Hadzi-Tanovic, Grigore Roşu, D. Marinov

Runtime Verification (RV) can help find bugs by monitoring program executions against formal properties. Developers should ideally use RV whenever they run tests, to find more bugs earlier. Despite tremendous research progress, RV still incurs high overhead in (1) machine time to monitor properties and (2) developer time to wait for and inspect violations from test executions that do not satisfy the properties. Moreover, all prior RV techniques consider only one program version and wastefully re-monitor unaffected properties and code as software evolves. We present the first evolution-aware RV techniques that reduce RV overhead across multiple program versions. Regression Property Selection (RPS) re-monitors only properties that can be violated in parts of code affected by changes, reducing machine time and developer time. Violation Message Suppression (VMS) simply shows only new violations to reduce developer time; it does not reduce machine time. Regression Property Prioritization (RPP) splits RV in two phases: properties more likely to find bugs are monitored in a critical phase to provide faster feedback to the developers; the rest are monitored in a background phase. We compare our techniques with the evolution-unaware (base) RV when monitoring test executions in 200 versions of 10 open-source projects. RPS and the RPP critical phase reduce the average RV overhead from 9.4x (for base RV) to 1.8x, without missing any new violations. VMS reduces the average number of violations 540x, from 54 violations per version (for base RV) to one violation per 10 versions.
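The idea behind Violation Message Suppression can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' tool: a violation is modelled as a (property, location) pair, the property names and locations are made up, and matching is by exact equality, whereas a real tool would need to cope with code that moves between versions.

```python
from typing import NamedTuple


class Violation(NamedTuple):
    prop: str      # name of the violated property (hypothetical)
    location: str  # code location that triggered it (hypothetical format)


def suppress_old_violations(previous, current):
    """Keep only the violations in `current` that were not reported for `previous`."""
    seen = set(previous)
    return [v for v in current if v not in seen]


old_version = [Violation("P1", "A.java:10"), Violation("P2", "B.java:7")]
new_version = old_version + [Violation("P1", "C.java:3")]
fresh = suppress_old_violations(old_version, new_version)

# The abstract's 540x figure: 54 violations per version versus one
# violation per 10 versions is a ratio of 54 * 10.
reduction_factor = 54 * 10
```

As the abstract notes, this kind of filtering saves developer time only: every property is still monitored, so machine time is unchanged.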



Citations: 10

Automatic Visual Verification of Layout Failures in Responsively Designed Web Pages

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00027

Ibrahim Althomali, G. M. Kapfhammer, Phil McMinn

Responsively designed web pages adjust their layout according to the viewport width of the device in use. Although tools exist to help developers test the layout of a responsive web page, they often rely on humans to flag problems. Yet, the considerable number of web-enabled devices with unique viewport widths makes this manual process both time-consuming and error-prone. Capable of detecting some common responsive layout failures, the ReDeCheck tool partially automates this process. Since ReDeCheck focuses on a web page's document object model (DOM), some of the issues it finds are not observable by humans. This paper presents a tool, called Viser, that renders a ReDeCheck-reported layout issue in a browser, adjusting the opacity of certain elements and checking for a visible difference. Unless Viser classifies an issue as a human-observable layout failure, a web developer can ignore it. This paper's experiments reveal the benefit of using Viser to support automated visual verification of layout failures in responsively designed web pages. Viser automatically classified all of the 117 layout failures that ReDeCheck reported for 20 web pages, each of which had to be manually analyzed in a prior study. Viser's automated manipulation of element opacity also highlighted manual classification's subjectivity: it categorized 28 issues differently to manual analysis, including three correctly reclassified as false positives.



Citations: 10

Efficiently Repairing Internationalization Presentation Failures by Solving Layout Constraints

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00026

Abdulmajeed Alameer, Paul T. Chiou, William G. J. Halfond

Web developers employ internationalization frameworks to automate web page translations and enable their web apps to more easily communicate with a global audience. However, the change of text size in different languages can lead to distortions in the translated web page's layout. These distortions are known as Internationalization Presentation Failures (IPFs). Debugging these IPFs can be a tedious and error-prone process. Previous research efforts to develop an automatic IPF repair technique could compromise the attractiveness and readability of the repaired web page. In this paper, we present a novel approach that can rapidly repair IPFs and maintain the readability and the attractiveness of the web page. Our approach models the correct layout of a web page as a system of constraints. The solution to the system represents the new and correct layout of the web page that resolves its IPFs. In the evaluation, we found that our approach could more quickly produce repairs that were rated as more attractive and more readable than those produced by a prior state-of-the-art technique.



Citations: 13

Parallel Many-Objective Search for Unit Tests

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00014

Verena Bader, José Campos, G. Fraser

Meta-heuristic search algorithms such as genetic algorithms have been applied successfully to generate unit tests, but typically take long to produce reasonable results, achieve sub-optimal code coverage, and have large variance due to their stochastic nature. Parallel genetic algorithms have been shown to be an effective improvement over sequential algorithms in many domains, but have seen little exploration in the context of unit test generation to date. In this paper, we describe a parallelised version of the many-objective sorting algorithm (MOSA) for test generation. Through the use of island models, where individuals can migrate between independently evolving populations, this algorithm not only reduces the necessary search time, but produces overall better results. Experiments with an implementation of parallel MOSA on the EvoSuite test generation tool using a large corpus of complex open source Java classes confirm that the parallelised MOSA algorithm achieves on average 84% code coverage, compared to 79% achieved by a standard sequential version.
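The island model mentioned in the abstract can be illustrated with a toy sketch. The assumptions are substantial and worth stating: this is a plain single-objective genetic algorithm on bit strings (fitness is the number of ones), not MOSA and not EvoSuite; the islands are stepped one after another in a single process rather than in parallel; and the ring topology, migration interval, and all parameter values are illustrative choices.

```python
import random


def fitness(individual):
    """Toy objective: count the ones in a bit string."""
    return sum(individual)


def evolve(population, rng):
    """One generation: keep the best individual, fill the rest with mutated tournament winners."""
    best = max(population, key=fitness)
    next_population = [best[:]]
    while len(next_population) < len(population):
        a, b = rng.sample(population, 2)
        child = max(a, b, key=fitness)[:]
        child[rng.randrange(len(child))] ^= 1  # flip one random bit
        next_population.append(child)
    return next_population


def migrate(islands):
    """Ring migration: each island's best replaces the worst individual of the next island."""
    bests = [max(population, key=fitness)[:] for population in islands]
    for k, population in enumerate(islands):
        worst = min(range(len(population)), key=lambda j: fitness(population[j]))
        population[worst] = bests[(k - 1) % len(islands)]


def run(n_islands=4, pop_size=10, length=20, generations=50, interval=5, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
               for _ in range(n_islands)]
    start_best = max(fitness(ind) for population in islands for ind in population)
    for generation in range(1, generations + 1):
        islands = [evolve(population, rng) for population in islands]
        if generation % interval == 0:
            migrate(islands)  # islands exchange individuals only at these points
    end_best = max(fitness(ind) for population in islands for ind in population)
    return start_best, end_best


start_best, end_best = run()
```

Between migrations the islands are independent, which is what makes the scheme easy to parallelise: each `evolve` call could run in its own process, with synchronisation needed only when `migrate` is called.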


Citations: 1

Suspend-Less Debugging for Interactive and/or Realtime Programs

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00028

Haruto Tanno, H. Iwasaki

Programs with interactive and/or realtime activities, such as GUI programs, action game programs, network-based programs, and sensor information processing programs, are not suitable for traditional breakpoint-based debugging, in which execution of the target program is suspended. There are two reasons. First, since the timing and order of input events such as user operations are quite important, such programs do not behave as expected if execution is suspended at a breakpoint. Second, suspending a program to observe its internal states significantly degrades the efficiency of debugging. A debugging method is presented that resolves these problems. It keeps track of both the currently executing statement in a program and the changes in the values of expressions of interest, and visualizes them in realtime. The proposed method was implemented as SLDSharp, a debugger for C# programs, by means of a program transformation technique. A case study of debugging a practical game program created with the Unity game engine shows that SLDSharp makes efficient debugging possible.
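The core idea of recording value changes instead of suspending execution can be sketched as follows. This is a Python analogue written for illustration, not SLDSharp itself, which targets C# and works through program transformation; the `Watch` class, the `probe` method, and the hand-wrapped statement are hypothetical stand-ins for what such a transformation might insert.

```python
import time


class Watch:
    """Records changes of watched expressions while the program keeps running."""

    def __init__(self):
        self.log = []    # (timestamp, name, value) entries, in order of occurrence
        self._last = {}  # last value seen for each watched name

    def probe(self, name, value):
        """Log `value` only if it differs from the last one seen; return it unchanged."""
        if name not in self._last or self._last[name] != value:
            self._last[name] = value
            self.log.append((time.monotonic(), name, value))
        return value


watch = Watch()


def update(frame):
    # Original statement: x = frame // 2
    # Transformed statement: the expression is wrapped in a probe.
    x = watch.probe("x", frame // 2)
    return x


for frame in range(6):
    update(frame)

changes = [(name, value) for _, name, value in watch.log]
```

Because `probe` returns its argument unchanged and never blocks, the program's timing-sensitive behaviour is disturbed far less than by a breakpoint, and a separate view could render `watch.log` while the program runs.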


Citations: 1

SeqFuzzer: An Industrial Protocol Fuzzing Framework from a Deep Learning Perspective

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00016

Hui Zhao, Zhihui Li, Hansheng Wei, Jianqi Shi, Yanhong Huang

Industrial networks are the cornerstone of modern industrial control systems. Performing security checks of industrial communication processes helps detect unknown risks and vulnerabilities. Fuzz testing is a widely used method for performing security checks that takes advantage of automation. However, carrying out security checks on industrial networks is a major challenge due to the increasing variety and complexity of industrial communication protocols. Existing approaches usually require the protocol to be modeled before test cases can be generated, which is labor-intensive and time-consuming. This becomes even worse when the target protocol is stateful. To help address this problem, we employed a deep learning model to learn the structures of protocol frames and deal with the temporal features of stateful protocols. We propose a fuzzing framework named SeqFuzzer which automatically learns the protocol frame structures from communication traffic and generates fake but plausible messages as test cases. To demonstrate the usability of our approach, we applied SeqFuzzer to widely used Ethernet for Control Automation Technology (EtherCAT) devices and successfully detected several security vulnerabilities.



Citations: 32

iDFlakies: A Framework for Detecting and Partially Classifying Flaky Tests

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00038

Wing Lam, Reed Oei, A. Shi, D. Marinov, Tao Xie

Regression testing is increasingly important with the wide use of continuous integration. A desirable requirement for regression testing is that a test failure reliably indicates a problem in the code under test and not a false alarm from the test code or the testing infrastructure. However, some test failures are unreliable, stemming from flaky tests that can nondeterministically pass or fail for the same code under test. There are many types of flaky tests, with order-dependent tests being a prominent type. To help advance research on flaky tests, we present (1) a framework, iDFlakies, to detect and partially classify flaky tests; (2) a dataset of flaky tests in open-source projects; and (3) a study with our dataset. iDFlakies automates experimentation with our tool for Maven-based Java projects. Using iDFlakies, we build a dataset of 422 flaky tests, with 50.5% order-dependent and 49.5% not. Our study of these flaky tests reports the prevalence of the two types of flaky tests, the probability that a test-suite run has at least one failure due to flaky tests, and how different test reorderings affect the number of detected flaky tests. We envision that our work can spur research to alleviate the problem of flaky tests.
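The distinction between order-dependent and other flaky tests can be illustrated with a small sketch. It is a simplification under stated assumptions, not the iDFlakies implementation: the suite is modelled as a function `run_order` that takes a list of test names and returns pass/fail per test, the reorderings tried are the reversed order followed by seeded random shuffles, and a test is labelled order-dependent when its changed outcome is reproducible in the new order while the original order still reproduces the original outcome. All names are hypothetical.

```python
import random


def classify(tests, run_order, rounds=5, seed=0):
    """Label tests whose outcome changes under reordering.

    run_order(order) must return a dict mapping test name to True (pass) or False (fail).
    """
    rng = random.Random(seed)
    original = list(tests)
    baseline = run_order(original)
    orders = [original[::-1]]           # always try the reversed order first
    for _ in range(rounds):
        shuffled = original[:]
        rng.shuffle(shuffled)
        orders.append(shuffled)

    labels = {}
    for order in orders:
        result = run_order(order)
        for test in original:
            if test in labels or result[test] == baseline[test]:
                continue
            # Re-run both orders to see whether the difference is reproducible.
            same_in_new = run_order(order)[test] == result[test]
            same_in_original = run_order(original)[test] == baseline[test]
            if same_in_new and same_in_original:
                labels[test] = "order-dependent"
            else:
                labels[test] = "not order-dependent"
    return labels


def toy_suite(order):
    """Toy suite with shared state: test_uses_state passes only after test_setup has run."""
    ready = False
    results = {}
    for name in order:
        if name == "test_setup":
            ready = True
            results[name] = True
        elif name == "test_uses_state":
            results[name] = ready
        else:
            results[name] = True
    return results


labels = classify(["test_setup", "test_uses_state", "test_independent"], toy_suite)
```

In the toy suite the reversed order runs `test_uses_state` before `test_setup`, so it fails reproducibly there and passes reproducibly in the original order, which is the signature of an order-dependent test.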



Citations: 100

Classifying False Positive Static Checker Alarms in Continuous Integration Using Convolutional Neural Networks

2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Pub Date : 2019-04-22 DOI: 10.1109/ICST.2019.00048

Seongmin Lee, Shin Hong, Jungbae Yi, Taeksu Kim, Chul-Joo Kim, S. Yoo

Static code analysis in a Continuous Integration (CI) environment can significantly improve the quality of a software system because it enables early detection of defects without any test executions or user interactions. However, being a conservative over-approximation of system behaviours, static analysis also produces a large number of false positive alarms, and identifying them takes up valuable developer time. We present an automated classifier based on Convolutional Neural Networks (CNNs). We hypothesise that many false positive alarms can be classified by identifying specific lexical patterns in the parts of the code that raised the alarm; human engineers adopt a similar tactic. We train a CNN-based classifier to learn and detect these lexical patterns, using a total of about 10K historical static analysis alarms generated by six static analysis checkers for over 27 million LOC, together with the labels assigned by actual developers. The results of our empirical evaluation suggest that our classifier can be highly effective for identifying false positive alarms, with an average precision of 79.72% across all six checkers.
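To make the notion of a "lexical pattern" concrete, the sketch below applies a single one-dimensional convolution filter with max pooling to a token sequence, in pure Python. It is an illustration only, not the authors' architecture (which the abstract does not specify): the tokenizer, the C-like snippets, and the hand-set filter weights are all hypothetical, and in a trained CNN the weights would be learned from labeled alarms rather than written by hand.

```python
import re
from typing import Dict, List

# Hypothetical lexer: identifiers, integers, a few two-character
# operators, then any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[!=<>]=|->|\S")

# A filter holds one weight table per window position. Looking up a
# token's weight is equivalent to a dot product with its one-hot vector.
Filter = List[Dict[str, float]]

# Hand-set weights (an assumption for illustration): the filter fires
# only on the three-token sequence "!= NULL )", i.e. a null-check guard.
NULL_GUARD_FILTER: Filter = [{"!=": 1.0}, {"NULL": 1.0}, {")": 1.0}]
NULL_GUARD_BIAS = -2.0


def tokenize(code: str) -> List[str]:
    """Split a code fragment into lexical tokens."""
    return TOKEN_RE.findall(code)


def conv1d_max(tokens: List[str], filt: Filter, bias: float = 0.0) -> float:
    """Slide the filter over the tokens, apply ReLU, and max-pool."""
    width = len(filt)
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        activation = bias + sum(w.get(tok, 0.0) for w, tok in zip(filt, window))
        best = max(best, activation)
    return best


def null_guard_score(code: str) -> float:
    """Feature value for the hypothetical null-guard pattern."""
    return conv1d_max(tokenize(code), NULL_GUARD_FILTER, NULL_GUARD_BIAS)


if __name__ == "__main__":
    print(null_guard_score("if (p != NULL) free(p);"))
    print(null_guard_score("free(p);"))
```

A real classifier would combine many such filters of different widths over learned token embeddings and feed the pooled features to a final classification layer. Whether a null-check guard actually indicates a false positive is likewise an assumption made for this example.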

Citations: 21