NASA Formal Methods: Latest Papers

Multi-Objective Task Assignment and Multiagent Planning with Hybrid GPU-CPU Acceleration
Pub Date : 2023-05-08 DOI: 10.48550/arXiv.2305.04397
T. Robinson, Guoxin Su
Allocation and planning with a collection of tasks and a group of agents is an important problem in multiagent systems. One commonly faced bottleneck is scalability, as in general the multiagent model increases exponentially in size with the number of agents. We consider the combination of random task assignment and multiagent planning under multiple-objective constraints, and show that this problem can be decentralised to individual agent-task models. We present an algorithm of point-oriented Pareto computation, which checks whether a point corresponding to given cost and probability thresholds for our formal problem is feasible or not. If the given point is infeasible, our algorithm finds a Pareto-optimal point which is closest to the given point. We provide the first multi-objective model checking framework that simultaneously uses GPU and multi-core acceleration. Our framework manages CPU and GPU devices as a load balancing problem for parallel computation. Our experiments demonstrate that parallelisation achieves significant run time speed-up over sequential computation.
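The point-oriented check described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a finite, explicitly given set of achievable points, treats every objective as a quantity to minimise (a probability threshold could be encoded as 1 minus the probability), and measures closeness with Euclidean distance. The names `dominates`, `pareto_front` and `check_point` are hypothetical.

```python
import math


def dominates(p, q):
    """True if p is no worse than q in every objective and strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def check_point(points, target):
    """Return (True, witness) if some achievable point meets every threshold,
    otherwise (False, closest Pareto-optimal point)."""
    for p in points:
        if all(a <= t for a, t in zip(p, target)):
            return True, p
    closest = min(pareto_front(points), key=lambda p: math.dist(p, target))
    return False, closest
```

For example, with achievable points (4, 1), (2, 2), (1, 4) and (3, 3), the target (2.5, 2.5) is feasible with witness (2, 2), while the target (1, 1) is infeasible and the sketch returns (2, 2) as the nearest Pareto-optimal point.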
Citations: 0
A Linear Weight Transfer Rule for Local Search
Pub Date : 2023-03-27 DOI: 10.48550/arXiv.2303.14894
Md. Solimul Chowdhury, Cayden Codel, Marijn J. H. Heule
The Divide and Distribute Fixed Weights algorithm (ddfw) is a dynamic local search SAT-solving algorithm that transfers weight from satisfied to falsified clauses in local minima. ddfw is remarkably effective on several hard combinatorial instances. Yet, despite its success, it has received little study since its debut in 2005. In this paper, we propose three modifications to the base algorithm: a linear weight transfer method that moves a dynamic amount of weight between clauses in local minima, an adjustment to how satisfied clauses are chosen in local minima to give weight, and a weighted-random method of selecting variables to flip. We implemented our modifications to ddfw on top of the solver yalsat. Our experiments show that our modifications boost the performance compared to the original ddfw algorithm on multiple benchmarks, including those from the past three years of SAT competitions. Moreover, our improved solver exclusively solves hard combinatorial instances that refute a conjecture on the lower bound of two Van der Waerden numbers set forth by Ahmed et al. (2014), and it performs well on a hard graph-coloring instance that has been open for over three decades.
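As a rough illustration of what a linear weight transfer between clauses could look like, the sketch below moves an amount that depends linearly on the weight of the satisfied clause. The coefficients `a` and `c`, their default values, the clamping step and the function name `linear_transfer` are assumptions made for this example; the abstract does not give the actual rule or its parameters.

```python
def linear_transfer(weights, satisfied, falsified, a=0.5, c=1.0):
    """Move a * w + c units of weight from a satisfied clause to a falsified one.

    `weights` maps clause ids to their current weights and is updated in place.
    """
    amount = a * weights[satisfied] + c
    # Assumption: never take more weight than the satisfied clause holds.
    amount = min(amount, weights[satisfied])
    weights[satisfied] -= amount
    weights[falsified] += amount
    return amount
```

Because the amount scales with the donor clause's weight, heavily weighted satisfied clauses give up more in a local minimum than lightly weighted ones, and the total weight over all clauses stays constant.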
Citations: 0
Automata-Based Software Model Checking of Hyperproperties
Pub Date : 2023-03-26 DOI: 10.48550/arXiv.2303.14796
B. Finkbeiner, Hadar Frenkel, Jana Hofmann, Jan-Luca Lohse
We develop model checking algorithms for Temporal Stream Logic (TSL) and Hyper Temporal Stream Logic (HyperTSL) modulo theories. TSL extends Linear Temporal Logic (LTL) with memory cells, functions and predicates, making it a convenient and expressive logic to reason over software and other systems with infinite data domains. HyperTSL further extends TSL to the specification of hyperproperties, that is, properties that relate multiple system executions. As such, HyperTSL can express information flow policies like noninterference in software systems. We augment HyperTSL with theories, resulting in HyperTSL(T), and build on methods from LTL software verification to obtain model checking algorithms for TSL and HyperTSL(T). This results in a sound but necessarily incomplete algorithm for specifications contained in the ∀*∃* fragment of HyperTSL(T). Our approach constitutes the first software model checking algorithm for temporal hyperproperties with quantifier alternations that does not rely on a finite-state abstraction.
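To make the idea of a hyperproperty concrete, the following sketch checks a simple form of noninterference on a finite set of finite traces: any two executions that agree on their low-security inputs must also agree on their low-security outputs. This is only an illustration of a property that relates pairs of executions; it is not the paper's model checking algorithm, and the trace encoding and the name `noninterferent` are assumptions.

```python
from itertools import combinations


def noninterferent(traces):
    """Each trace is a pair (low_inputs, low_outputs) of tuples.

    The property holds if every pair of traces with equal low inputs
    also has equal low outputs.
    """
    return all(
        out1 == out2
        for (in1, out1), (in2, out2) in combinations(traces, 2)
        if in1 == in2
    )
```

A property of a single trace could be checked one execution at a time; this one cannot, because a violation only becomes visible when two executions are compared.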
Citations: 0
Strategy Synthesis in Markov Decision Processes Under Limited Sampling Access
Pub Date : 2023-03-22 DOI: 10.48550/arXiv.2303.12718
C. Baier, Clemens Dubslaff, Patrick Wienhöft, S. Kiebel
A central task in control theory, artificial intelligence, and formal methods is to synthesize reward-maximizing strategies for agents that operate in partially unknown environments. In environments modeled by gray-box Markov decision processes (MDPs), the impact of the agents' actions is known in terms of successor states but not the stochastics involved. In this paper, we devise a strategy synthesis algorithm for gray-box MDPs via reinforcement learning that utilizes interval MDPs as an internal model. To cope with limited sampling access in reinforcement learning, we incorporate two novel concepts into our algorithm, focusing on rapid and successful learning rather than on stochastic guarantees and optimality: lower confidence bound exploration reinforces variants of already learned practical strategies, and action scoping reduces the learning action space to promising actions. We illustrate the benefits of our algorithms by means of a prototypical implementation applied to examples from the AI and formal methods communities.
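One way to picture an interval MDP built from samples is to turn the observed successor counts of a state-action pair into probability intervals. The sketch below uses a two-sided Hoeffding bound for this; that choice, the confidence parameter `delta`, and the name `transition_intervals` are assumptions for illustration and may differ from what the paper uses.

```python
import math


def transition_intervals(counts, delta=0.05):
    """Map successor counts of one state-action pair to probability intervals.

    `counts` maps each known successor state to how often it was sampled.
    Each interval holds with probability at least 1 - delta (Hoeffding bound).
    """
    n = sum(counts.values())
    if n == 0:
        # No samples yet: nothing is known beyond the successor set itself.
        return {s: (0.0, 1.0) for s in counts}
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return {
        s: (max(0.0, k / n - eps), min(1.0, k / n + eps))
        for s, k in counts.items()
    }
```

The gray-box setting shows up in the input: the keys of `counts` (the successor states) are known in advance, and only the probabilities have to be learned. With few samples the intervals are wide, which is the situation that limited sampling access creates.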
Citations: 0
Formalizing Piecewise Affine Activation Functions of Neural Networks in Coq
Pub Date : 2023-01-30 DOI: 10.48550/arXiv.2301.12893
A. Aleksandrov, Kim Völlinger
Verification of neural networks relies on activation functions being piecewise affine (pwa) -- enabling an encoding of the verification problem for theorem provers. In this paper, we present the first formalization of pwa activation functions for an interactive theorem prover tailored to verifying neural networks within Coq using the library Coquelicot for real analysis. As a proof-of-concept, we construct the popular pwa activation function ReLU. We integrate our formalization into a Coq model of neural networks, and devise a verified transformation from a neural network N to a pwa function representing N by composing pwa functions that we construct for each layer. This representation enables encodings for proof automation, e.g. Coq's tactic lra -- a decision procedure for linear real arithmetic. Further, our formalization paves the way for integrating Coq in frameworks of neural network verification as a fallback prover when automated proving fails.
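The notion of a piecewise affine function can be made concrete outside Coq. The sketch below, written in Python rather than Coq, represents ReLU as two affine pieces with guards and builds a network by composing affine layers with it. The representation as `(guard, slope, intercept)` triples and the names `relu`, `layer` and `network` are assumptions for illustration and do not mirror the paper's formalization.

```python
def relu(x):
    """ReLU as a piecewise affine function: two guarded pieces slope * x + intercept."""
    pieces = [
        (lambda v: v < 0.0, 0.0, 0.0),   # x < 0  ->  0 * x + 0
        (lambda v: v >= 0.0, 1.0, 0.0),  # x >= 0 ->  1 * x + 0
    ]
    for guard, slope, intercept in pieces:
        if guard(x):
            return slope * x + intercept
    raise ValueError("input is not covered by any piece (e.g. NaN)")


def layer(weights, bias, xs):
    """One layer: an affine map followed by ReLU applied componentwise."""
    return [
        relu(sum(w * x for w, x in zip(row, xs)) + b)
        for row, b in zip(weights, bias)
    ]


def network(layers, xs):
    """Compose the layers in order; the result is again piecewise affine in xs."""
    for weights, bias in layers:
        xs = layer(weights, bias, xs)
    return xs
```

On each region where all guards are fixed, the whole network reduces to a single affine map, which is what makes linear-arithmetic decision procedures applicable.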
Citations: 1
Conservative Safety Monitors of Stochastic Dynamical Systems
Pub Date : 2023-01-27 DOI: 10.48550/arXiv.2301.11330
Matthew Cleaveland, I. Ruchkin, O. Sokolsky, Insup Lee
Generating accurate runtime safety estimates for autonomous systems is vital to ensuring their continued proliferation. However, exhaustive reasoning about future behaviors is generally too complex to do at runtime. To provide scalable and formal safety estimates, we propose a method for leveraging design-time model checking results at runtime. Specifically, we model the system as a probabilistic automaton (PA) and compute bounded-time reachability probabilities over the states of the PA at design time. At runtime, we combine distributions of state estimates with the model checking results to produce a bounded time safety estimate. We argue that our approach produces well-calibrated safety probabilities, assuming the estimated state distributions are well-calibrated. We evaluate our approach on simulated water tanks.
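A minimal sketch of the runtime step, assuming the simplest possible combination rule: the design-time table stores, for each state, the probability of reaching an unsafe state within the time bound, and the runtime estimate weights those values by the current state distribution. The weighted sum, the table layout and the name `safety_estimate` are assumptions for illustration; the abstract does not specify how the two are combined or how conservativeness is obtained.

```python
import math


def safety_estimate(state_belief, reach_unsafe_prob):
    """Probability of staying safe within the bound, averaged over the state belief.

    state_belief: maps state -> estimated probability of currently being there.
    reach_unsafe_prob: maps state -> precomputed bounded-time probability of
        reaching an unsafe state (a design-time model checking result).
    """
    total = sum(state_belief.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("state belief must sum to 1")
    p_unsafe = sum(p * reach_unsafe_prob[s] for s, p in state_belief.items())
    return 1.0 - p_unsafe
```

The expensive reachability computation happens once at design time, so the runtime cost is a single pass over the states that carry probability mass.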
Citations: 1
Open- and Closed-Loop Neural Network Verification using Polynomial Zonotopes
Pub Date : 2022-07-06 DOI: 10.1007/978-3-031-33170-1_2
Niklas Kochdumper, Christian Schilling, M. Althoff, Stanley Bak
Citations: 15
More Programming Than Programming: Teaching Formal Methods in a Software Engineering Programme
Pub Date : 2022-05-02 DOI: 10.48550/arXiv.2205.00787
J. Noble, David Streader, Isaac Oscar Gariano, Miniruwani Samarakoon
Formal methods for software correctness are critical to the future of software engineering, and so must be an essential part of software engineering education. Unfortunately, formal methods are often resisted by students due to perceived difficulty, mathematicity, and practical irrelevance. We redeveloped our software correctness course by taking a programming intensive approach, using the solver-aided language Dafny to provide instant formative feedback via automated assessment. Our redeveloped course increased student retention and resulted in the best evaluation for the course for at least ten years. Abstract Formal Modelling: We also considered taking an approach based on abstract formal modelling. High-level tools, such as TLA+ [33], Alloy [27] or SPIN [26], support reasoning and mechanised checking of systems' properties, based on abstract models of those systems, rather than actual programming and source code. It is clear that these kinds of abstract formal models can play an important role in software engineering projects, at least in the project's early stages,
Citations: 0
Towards Better Test Coverage: Merging Unit Tests for Autonomous Systems
Pub Date : 2022-04-06 DOI: 10.48550/arXiv.2204.02541
Josefine B. Graebener, Apurva Badithela, R. Murray
We present a framework for merging unit tests for autonomous systems. Typically, it is intractable to test an autonomous system for every scenario in its operating environment. The question of whether it is possible to design a single test for multiple requirements of the system motivates this work. First, we formally define three attributes of a test: a test specification that characterizes behaviors observed in a test execution, a test environment, and a test policy. Using the merge operator from contract-based design theory, we provide a formalism to construct a merged test specification from two unit test specifications. Temporal constraints on the merged test specification guarantee that non-trivial satisfaction of both unit test specifications is necessary for a successful merged test execution. We assume that the test environment remains the same across the unit tests and the merged test. Given a test specification and a test environment, we synthesize a test policy filter using a receding horizon approach, and use the test policy filter to guide a search procedure (e.g. Monte-Carlo Tree Search) to find a test policy that is guaranteed to satisfy the test specification. This search procedure finds a test policy that maximizes a pre-defined robustness metric for the test while the filter guarantees a test policy for satisfying the test specification. We prove that our algorithm is sound. Furthermore, the receding horizon approach to synthesizing the filter ensures that our algorithm is scalable. Finally, we show that merging unit tests is impactful for designing efficient test campaigns to achieve similar levels of coverage in fewer test executions. We illustrate our framework on two self-driving examples in a discrete-state setting.
Citations: 3
NNLander-VeriF: A Neural Network Formal Verification Framework for Vision-Based Autonomous Aircraft Landing
Pub Date : 2022-03-29 DOI: 10.48550/arXiv.2203.15841
Ulices Santa Cruz, Yasser Shoukry
In this paper, we consider the problem of formally verifying a Neural Network (NN) based autonomous landing system. In such a system, a NN controller processes images from a camera to guide the aircraft while approaching the runway. A central challenge for the safety and liveness verification of vision-based closed-loop systems is the lack of mathematical models that capture the relation between the system states (e.g., position of the aircraft) and the images processed by the vision-based NN controller. Another challenge is the limited abilities of state-of-the-art NN model checkers. Such model checkers can reason only about simple input-output robustness properties of neural networks. This limitation creates a gap between the NN model checker abilities and the need to verify a closed-loop system while considering the aircraft dynamics, the perception components, and the NN controller. To this end, this paper presents NNLander-VeriF, a framework to verify vision-based NN controllers used for autonomous landing. NNLander-VeriF addresses the challenges above by exploiting geometric models of perspective cameras to obtain a mathematical model that captures the relation between the aircraft states and the inputs to the NN controller. By converting this model into a NN (with manually assigned weights) and composing it with the NN controller, one can capture the relation between aircraft states and control actions using one augmented NN. Such an augmented NN model leads to a natural encoding of the closed-loop verification into several NN robustness queries, which state-of-the-art NN model checkers can handle. Finally, we evaluate our framework to formally verify the properties of a trained NN and we show its efficiency.
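The geometric camera model mentioned in the abstract is, in its simplest textbook form, the pinhole projection, which maps a 3D point in the camera frame to image coordinates. The sketch below shows only that standard projection; how the paper derives its model from the aircraft state and encodes it as a NN is not described in the abstract, and the name `project` and its parameters are assumptions.

```python
def project(point, focal_length, principal_point=(0.0, 0.0)):
    """Pinhole projection of a camera-frame point (x, y, z) onto the image plane.

    z is the depth along the optical axis and must be positive,
    i.e. the point lies in front of the camera.
    """
    x, y, z = point
    if z <= 0.0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    return (focal_length * x / z + cx, focal_length * y / z + cy)
```

For instance, `project((1.0, 2.0, 4.0), 2.0)` gives `(0.5, 1.0)`. Since the image coordinates of fixed runway features are a deterministic function of the camera pose, a model of this kind can link aircraft states to what the controller sees.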
Citations: 12
Journal
NASA Formal Methods