2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER): Latest Articles

ICSE-NIER 2023 Committee Lists

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-05-01 DOI: 10.1109/icse-nier58687.2023.00006

Citations: 0

Test-Driven Development Benefits Beyond Design Quality: Flow State and Developer Experience

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-05-01 DOI: 10.1109/ICSE-NIER58687.2023.00025

Pedro Henrique Calais Guerra, Lissa Franzini

Test-driven development (TDD) is a coding technique that combines design and testing in an iterative and incremental fashion. It prescribes that tests be written before the production code, which helps the developer find good interfaces and evolve the design safely and incrementally. Improvements in the design of code produced with TDD have been extensively evaluated in the literature; in this research, we seek explanations for the benefits of TDD in another dimension that we believe has been undervalued: developer experience. We identify a natural connection between TDD and flow state, a well-known mental state characterized by total immersion, focus, and involvement in a task, which promotes increased enjoyment and productivity. We present evidence that the continuous stream of mini-scope, short-lived red-green-refactor cycles of TDD frames the development task as a structure that creates the preconditions reported by neuroscience research to produce flow state, namely (1) clear goals, (2) unambiguous feedback, (3) challenge-skill balance, and (4) sense of control. Our work contributes to a better understanding of why adopting practices such as TDD can benefit the software development process as a whole, and can support the adoption of TDD in software development projects.
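As an illustration only (it is not taken from the paper), one red-green-refactor cycle might look like the following Python sketch. The function `slugify`, its tests, and the helper `run_cycle` are hypothetical names chosen for this example. The tests are written first and fail (red), the smallest implementation makes them pass (green), and the passing tests then guard any refactoring.

```python
import io
import unittest


def slugify(title: str) -> str:
    """Green step: the simplest implementation that satisfies the tests below."""
    # split() with no arguments drops leading, trailing, and repeated whitespace.
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    """Red step: in TDD these tests exist before slugify() is implemented."""

    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Flow State"), "flow-state")

    def test_collapses_whitespace(self):
        self.assertEqual(slugify("  Red   Green  Refactor "), "red-green-refactor")


def run_cycle() -> bool:
    """Run the suite once; True means 'green', so refactoring is safe."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite).wasSuccessful()
```

Each test states a small, clear goal, and the pass/fail result of `run_cycle()` is immediate, unambiguous feedback, which matches the first two flow preconditions the abstract lists.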

Citations: 0

Assurance Case Development as Data: A Manifesto

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-05-01 DOI: 10.1109/ICSE-NIER58687.2023.00030

C. Menghi, Torin Viger, Alessio Di Sandro, Chris Rees, Jeff Joyce, M. Chechik

Safety problems can be costly and catastrophic. Engineers typically rely on assurance cases to ensure their systems are adequately safe. Building safe software systems requires engineers to iteratively design, analyze, and refine assurance cases until sufficient safety evidence is identified. Assurance case development is typically manual, time-consuming, and far from straightforward. This paper presents a manifesto for our forward-looking idea: using assurance cases as data. We argue that engineers produce a lot of data during the assurance case development process, and that this data can be collected and used to improve the process. Therefore, in this manifesto, we propose to monitor assurance case development activities, treat assurance cases as data, and learn suggestions that help safety engineers design safer systems.

Citations: 0

A Novel and Pragmatic Scenario Modeling Framework with Verification-in-the-loop for Autonomous Driving Systems

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-05-01 DOI: 10.1109/ICSE-NIER58687.2023.00021

Dehui Du, Bo Li, Chenghang Zheng, Xinyuan Zhang

Scenario modeling for Autonomous Driving Systems (ADS) enables scenario-based simulation and verification, which are critical for the development of safe ADS. However, as the complexity and uncertainty of ADS grow, it becomes increasingly challenging to model driving scenarios manually and to conduct verification analysis. To tackle these challenges, we propose a novel and pragmatic framework for scenario modeling, simulation, and verification. The novelty is that it is a verification-in-the-loop scenario modeling framework. We propose a scenario modeling language with formal semantics, based on the domain knowledge of ADS. It facilitates scenario verification to analyze the safety of scenario models. Moreover, scenario simulation is implemented on top of a scenario executor. Compared with existing work, our framework can simplify the description of scenarios in a non-programming, user-friendly manner, model the stochastic behavior of vehicles, support safety verification of scenario models with UPPAAL-SMC, and generate executable scenarios for open-source simulators such as CARLA. To preliminarily demonstrate the effectiveness and feasibility of our approach, we build a prototype tool and apply our approach to several typical ADS scenarios.

Citations: 0

Judging Adam: Studying the Performance of Optimization Methods on ML4SE Tasks

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-03-06 DOI: 10.1109/ICSE-NIER58687.2023.00027

D. Pasechnyuk, Anton Prazdnichnykh, Mikhail Evtikhiev, T. Bryksin

Solving a problem with a deep learning model requires researchers to optimize the loss function with a certain optimization method. The research community has developed more than a hundred different optimizers, yet there is scarce data on optimizer performance across tasks. In particular, none of the benchmarks test the performance of optimizers on source code-related problems. However, existing benchmark data indicates that certain optimizers may be more efficient for particular domains. In this work, we test the performance of various optimizers on deep learning models for source code and find that the choice of optimizer can have a significant impact on model quality, with up to two-fold score differences between some of the relatively well-performing optimizers. We also find that the RAdam optimizer (and its modification with the Lookahead envelope) performs best, doing well on almost all of the tasks we consider. Our findings show a need for a more extensive study of optimizers in code-related tasks, and indicate that the ML4SE community should consider using RAdam instead of Adam as the default optimizer for code-related deep learning tasks.
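For readers unfamiliar with how RAdam differs from Adam, the sketch below computes the variance-rectification factor as it is commonly described for RAdam. It is background illustration, not material from this abstract. The function name `radam_rectification` is hypothetical, and the threshold of 4 follows the usual textbook statement; some library implementations use a slightly different cutoff.

```python
import math
from typing import Optional


def radam_rectification(step: int, beta2: float = 0.999) -> Optional[float]:
    """Return the rectification factor r_t, or None while the variance of the
    adaptive learning rate is considered intractable (early steps).

    Assumption: this mirrors the commonly cited RAdam formulation; it is a
    sketch, not the implementation evaluated in the paper.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # Maximum length of the approximated simple moving average.
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    beta2_t = beta2 ** step
    rho_t = rho_inf - 2.0 * step * beta2_t / (1.0 - beta2_t)
    if rho_t <= 4.0:
        # Too few samples: RAdam falls back to an un-adapted momentum update.
        return None
    numerator = (rho_t - 4.0) * (rho_t - 2.0) * rho_inf
    denominator = (rho_inf - 4.0) * (rho_inf - 2.0) * rho_t
    return math.sqrt(numerator / denominator)
```

The factor is undefined for the first few steps, then grows towards 1, at which point the update behaves like Adam's. This acts as a built-in warm-up, which is one reason a drop-in swap from Adam to RAdam is cheap to try.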

Citations: 0

MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-03-03 DOI: 10.1109/ICSE-NIER58687.2023.00012

Katherine R. Maffey, Kyle Dotterrer, Jennifer Niemann, Iain J. Cruickshank, G. Lewis, Christian Kästner

Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.

Citations: 2

Iterative Assessment and Improvement of DNN Operational Accuracy

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-03-02 DOI: 10.1109/ICSE-NIER58687.2023.00014

Antonio Guerriero, R. Pietrantuono, S. Russo

Deep Neural Networks (DNNs) are now widely adopted in many application domains thanks to their human-like, or even superhuman, performance in specific tasks. However, due to unpredictable or unconsidered operating conditions, unexpected failures show up in the field, making the performance of a DNN in operation very different from the one estimated prior to release. In the life cycle of DNN systems, the assessment of accuracy is typically addressed in two ways: offline, via sampling of operational inputs, or online, via pseudo-oracles. The former is considered more expensive because the sampled inputs must be labeled manually. The latter is automatic but less accurate. We believe that emerging iterative, industrial-strength life cycle models for Machine Learning systems, like MLOps, offer the possibility to leverage inputs observed in operation not only to provide faithful estimates of a DNN's accuracy, but also to improve it through remodeling or retraining actions. We propose DAIC (DNN Assessment and Improvement Cycle), an approach that combines "low-cost" online pseudo-oracles and "high-cost" offline sampling techniques to estimate and improve the operational accuracy of a DNN over the iterations of its life cycle. Preliminary results show the benefits of combining the two approaches and integrating them in the DNN life cycle.
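The abstract does not describe how DAIC combines the two sources of evidence, so the following Python sketch is not the authors' algorithm. It only illustrates one generic way such a combination could work, under loudly stated assumptions: a cheap pseudo-oracle splits operational inputs into "predicted pass" and "predicted fail" strata, a small sample from each stratum is labeled manually, and the accuracy estimate is the stratum-weighted average. All names (`stratified_accuracy_estimate`, `pseudo_oracle`, `is_correct`) are hypothetical.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def stratified_accuracy_estimate(
    inputs: Sequence[T],
    pseudo_oracle: Callable[[T], bool],  # cheap, automatic, possibly wrong
    is_correct: Callable[[T], bool],     # expensive manual labeling (assumed)
    per_stratum: int,
    seed: int = 0,
) -> float:
    """Estimate operational accuracy by manually labeling only a few inputs
    per pseudo-oracle stratum (plain stratified sampling, not DAIC itself)."""
    if not inputs:
        raise ValueError("inputs must not be empty")
    rng = random.Random(seed)
    strata = {True: [], False: []}
    for x in inputs:
        strata[bool(pseudo_oracle(x))].append(x)

    estimate = 0.0
    for members in strata.values():
        if not members:
            continue
        sample = rng.sample(members, min(per_stratum, len(members)))
        stratum_accuracy = sum(is_correct(x) for x in sample) / len(sample)
        # Weight each stratum by its share of the operational inputs.
        estimate += (len(members) / len(inputs)) * stratum_accuracy
    return estimate
```

The design choice here is that the pseudo-oracle never decides correctness on its own; it only directs where the limited manual-labeling budget is spent.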

Citations: 1

Reasoning-Based Software Testing

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-03-02 DOI: 10.1109/ICSE-NIER58687.2023.00018

L. Giamattei, R. Pietrantuono, S. Russo

With software systems becoming increasingly pervasive and autonomous, our ability to test for their quality is severely challenged. Many systems are called to operate in uncertain and highly changing environments, and are often required to make intelligent decisions by themselves. This easily results in an intractable state space to explore at testing time. State-of-the-art techniques try to keep pace, e.g., by augmenting the tester's intuition with some form of (explicit or implicit) learning from observations to search this space efficiently. For instance, they exploit historical data to drive the search (e.g., ML-driven testing) or the test execution data itself (e.g., adaptive or search-based testing). Despite the indubitable advances, the need for a smarter search in such a huge space remains pressing. We introduce Reasoning-Based Software Testing (RBST), a new way of thinking about the testing problem as a causal reasoning task. Compared to mere intuition-based or state-of-the-art learning-based strategies, we claim that causal reasoning more naturally emulates the process that a human would follow to "smartly" search the space. RBST aims to mimic and amplify this ability with the power of computation. The conceptual leap can pave the way for a new trend of techniques, which can be variously instantiated from the proposed framework by exploiting the numerous tools for causal discovery and inference. Preliminary results reported in this paper are promising.

Citations: 1

Safe-DS: A Domain Specific Language to Make Data Science Safe

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-02-28 DOI: 10.1109/ICSE-NIER58687.2023.00019

Lars Reimann, Günter Kniesel-Wünsche

Due to the long runtime of Data Science (DS) pipelines, even small programming mistakes can be very costly if they are not detected statically. However, even basic static type checking of DS pipelines is difficult because most are written in Python. Static typing is available in Python only via external linters. These require static type annotations for parameters or results of functions, which many DS libraries do not provide. In this paper, we show how the wealth of Python DS libraries can be used in a statically safe way via Safe-DS, a domain specific language (DSL) for DS. Safe-DS catches conventional type errors plus errors related to range restrictions, data manipulation, and call order of functions, going well beyond the abilities of current Python linters. Python libraries are integrated into Safe-DS via a stub language for specifying the interface of their declarations, and an API-Editor that is able to extract type information from the code and documentation of Python libraries and automatically generate suitable stubs. Moreover, Safe-DS complements textual DS pipelines with a graphical representation that eases safe development by preventing syntax errors. The seamless synchronization of the textual and graphical views lets developers always choose the one best suited to their skills and current task. We think that Safe-DS can make DS development easier, faster, and more reliable, significantly reducing development costs.

Citations: 1

An Alternative to Cells for Selective Execution of Data Science Pipelines

2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Pub Date : 2023-02-28 DOI: 10.1109/ICSE-NIER58687.2023.00029

Lars Reimann, Günter Kniesel-Wünsche

Data scientists often use notebooks to develop Data Science (DS) pipelines, particularly because notebooks allow parts of the pipeline to be executed selectively. However, notebooks for DS have many well-known flaws. We focus on the following ones in this paper: (1) Notebooks can become littered with code cells that are not part of the main DS pipeline but exist solely to make decisions (e.g., listing the columns of a tabular dataset). (2) While users are allowed to execute cells in any order, not every ordering is correct, because a cell can depend on declarations from other cells. (3) After making changes to a cell, this cell and all cells that depend on changed declarations must be rerun. (4) Changes to external values necessitate partial re-execution of the notebook. (5) Since cells are the smallest unit of execution, code that is unaffected by changes can inadvertently be re-executed. To solve these issues, we propose to replace cells as the basis for the selective execution of DS pipelines. Instead, we suggest populating a context menu for variables with actions fitting their type (like listing columns if the variable is a tabular dataset). These actions are executed based on a data-flow analysis to ensure that dependencies between variables are respected and results are updated properly after changes. Our solution separates pipeline code from decision-making code and automates dependency management, thus reducing clutter and the risk of making errors.
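The data-flow idea behind points (2), (3), and (5) can be illustrated with a minimal Python sketch. It is not the authors' tool. It assumes a straight-line pipeline in program order, where each step declares which variables it reads and writes; `Step` and `steps_to_rerun` are hypothetical names for this example.

```python
from typing import FrozenSet, List, NamedTuple, Sequence


class Step(NamedTuple):
    name: str
    reads: FrozenSet[str]
    writes: FrozenSet[str]


def steps_to_rerun(pipeline: Sequence[Step], changed_step: str) -> List[str]:
    """Return, in program order, the changed step plus every later step that
    reads a variable made stale by it (directly or transitively)."""
    stale: set = set()   # variables whose current values can no longer be trusted
    rerun: List[str] = []
    for step in pipeline:
        if step.name == changed_step or step.reads & stale:
            rerun.append(step.name)
            stale |= step.writes
    return rerun


# Hypothetical pipeline used only for illustration.
PIPELINE = [
    Step("load", frozenset(), frozenset({"raw"})),
    Step("clean", frozenset({"raw"}), frozenset({"table"})),
    Step("summary", frozenset({"table"}), frozenset({"stats"})),
    Step("config", frozenset(), frozenset({"seed"})),
    Step("train", frozenset({"table", "seed"}), frozenset({"model"})),
]
```

With this pipeline, editing `clean` marks `summary` and `train` for re-execution but leaves `load` and `config` alone, which is the finer-than-cell granularity the abstract argues for. A real analysis would also have to handle branches, loops, and side effects, which this sketch ignores.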

Citations: 0
