Automatic Metamorphic Test Oracles for Action-Policy Testing

International Conference on Automated Planning and Scheduling. Pub Date: 2023-07-01. DOI: 10.1609/icaps.v33i1.27185

Jan Eisenhut, Á. Torralba, M. Christakis, Jörg Hoffmann



Abstract

Testing is a promising way to gain trust in learned action policies π. Prior work on action-policy testing in AI planning formalized bugs as states t where π is sub-optimal with respect to a given testing objective. Deciding whether or not t is a bug is as hard as (optimal) planning itself. How can we design test oracles able to recognize some states t to be bugs efficiently? Recent work introduced metamorphic oracles which compare policy behavior on state pairs (s,t) where t is easier to solve; if π performs worse on t than on s, we know that t is a bug. Here, we show how to automatically design such oracles in classical planning, based on simulation relations between states. We introduce two oracle families of this kind: first, morphing query states t to obtain suitable s; second, maintaining and comparing upper bounds on h* across the states encountered during testing. Our experiments on ASNet policies show that these oracles can find bugs much more quickly than the existing alternatives, which are search-based; and that the combination of our oracles with search-based ones almost consistently dominates all other oracles.
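The metamorphic argument in the abstract can be illustrated with a minimal sketch. Everything below is a hypothetical toy and not the authors' implementation: states are integers on a line with the goal at 0, the testing objective is assumed to be plan cost, `toy_policy` contains an injected bug at state 3, and `t_is_easier` stands in for a simulation relation that guarantees h*(t) ≤ h*(s). The first oracle compares policy cost on a pair (s, t); the second keeps upper bounds on h* from earlier policy runs and compares new runs against them.

```python
from typing import Callable, Dict, Tuple

# Hypothetical toy domain: integer states, goal state is 0.
Policy = Callable[[int], Tuple[int, float]]  # state -> (successor, action cost)


def toy_policy(state: int) -> Tuple[int, float]:
    """Illustrative policy with an injected bug at state 3."""
    if state == 3:
        return 2, 5.0  # needlessly expensive action (the bug)
    if state % 2 == 0:
        return state - 2, 2.0  # jump two cells for cost 2
    return state - 1, 1.0  # step one cell for cost 1


def policy_cost(policy: Policy, state: int, max_steps: int = 100) -> float:
    """Cost of running the policy from `state`; infinity if the goal is not reached."""
    total = 0.0
    for _ in range(max_steps):
        if state == 0:
            return total
        state, cost = policy(state)
        total += cost
    return float("inf")


def t_is_easier(s: int, t: int) -> bool:
    """Stand-in for a simulation relation: here, closer to the goal is never harder."""
    return 0 <= t <= s


def pair_oracle(policy: Policy, s: int, t: int) -> bool:
    """Report t as a bug if t is easier than s but the policy does worse on t.

    Soundness: cost_pi(t) > cost_pi(s) >= h*(s) >= h*(t), so the policy is sub-optimal on t.
    """
    return t_is_easier(s, t) and policy_cost(policy, t) > policy_cost(policy, s)


def bound_oracle(policy: Policy, t: int, upper: Dict[int, float]) -> bool:
    """Compare the policy's cost on t against stored upper bounds on h*.

    `upper` maps previously seen states s to the best known cost from s.
    """
    cost_t = policy_cost(policy, t)
    is_bug = any(t_is_easier(s, t) and cost_t > bound for s, bound in upper.items())
    # Record the new run so later queries can use it as an upper bound.
    upper[t] = min(upper.get(t, float("inf")), cost_t)
    return is_bug


if __name__ == "__main__":
    print(pair_oracle(toy_policy, s=4, t=3))
    bounds: Dict[int, float] = {}
    bound_oracle(toy_policy, 4, bounds)  # seeds an upper bound for state 4
    print(bound_oracle(toy_policy, 3, bounds))
```

Neither oracle needs to compute h* itself, which is the point made in the abstract: the comparison alone suffices to certify that t is a bug. Both are incomplete by design, so a `False` result says nothing about whether t is bug-free.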


