Training Binary Classifiers as Data Structure Invariants

2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). Pub Date: 2019-05-25. DOI: 10.1109/ICSE.2019.00084

F. Molina, Renzo Degiovanni, Pablo Ponzio, Germán Regis, Nazareno Aguirre, M. Frias

Pages: 759-770

Citations: 13

Abstract

We present a technique to distinguish valid from invalid data structure objects. The technique is based on building an artificial neural network, more precisely a binary classifier, and training it to identify valid and invalid instances of a data structure. The obtained classifier can then be used in place of the data structure's invariant, in order to attempt to identify (in)correct behaviors in programs manipulating the structure. In order to produce the valid objects to train the network, an assumed-correct set of object building routines is randomly executed. Invalid instances are produced by generating values for object fields that "break" the collected valid values, i.e., that assign values to object fields that have not been observed as feasible in the assumed-correct executions that led to the collected valid instances. We experimentally assess this approach, over a benchmark of data structures. We show that this learning technique produces classifiers that achieve significantly better accuracy in classifying valid/invalid objects compared to a technique for dynamic invariant detection, and leads to improved bug finding.
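The pipeline described in the abstract can be sketched in a few dozen lines. The following is a minimal illustration only, not the authors' implementation: the `BoundedStack` structure, the `encode` vectorization, the helper names `build_valid` and `break_valid`, the `TinyNet` one-hidden-layer network, and all hyperparameters are assumptions introduced here, since the abstract does not specify the object encoding or the network architecture. Valid objects come from randomly executing builder routines assumed to be correct; invalid objects come from overwriting a single field so that the resulting object was never observed among the valid ones.

```python
import math
import random

CAPACITY = 4  # hypothetical fixed capacity, so every object encodes to the same length


class BoundedStack:
    """Toy structure. Intended invariant: the first `size` slots are used, the rest are 0."""

    def __init__(self):
        self.slots = [0] * CAPACITY  # 0 marks an empty slot
        self.size = 0

    def push(self, value):
        if self.size < CAPACITY:
            self.slots[self.size] = value
            self.size += 1

    def pop(self):
        if self.size > 0:
            self.size -= 1
            self.slots[self.size] = 0


def encode(stack):
    """Flatten the object's fields into a fixed-length numeric vector (assumed encoding)."""
    return tuple([stack.size] + stack.slots)


def build_valid(rng, n_objects, max_calls=8):
    """Randomly execute the assumed-correct builders and collect the resulting objects."""
    valid = set()
    for _ in range(n_objects):
        s = BoundedStack()
        for _ in range(rng.randint(0, max_calls)):
            if rng.random() < 0.7:
                s.push(rng.randint(1, 2))
            else:
                s.pop()
        valid.add(encode(s))
    return valid


def break_valid(rng, valid, n_objects):
    """Overwrite one field of a valid object; keep the result only if it was never observed.

    Caveat: a valid-but-unobserved object would be labeled invalid by this procedure.
    """
    pool = sorted(valid)
    invalid = set()
    while len(invalid) < n_objects:
        vec = list(rng.choice(pool))
        i = rng.randrange(len(vec))
        vec[i] = rng.randint(0, CAPACITY) if i == 0 else rng.randint(0, 2)
        if tuple(vec) not in valid:
            invalid.add(tuple(vec))
    return invalid


class TinyNet:
    """One hidden tanh layer and a sigmoid output, trained by plain SGD on log-loss."""

    def __init__(self, n_in, n_hidden, rng):
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _forward(self, x):
        h = [math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, x))) for w in self.w1]
        z = self.w2[-1] + sum(wi * hi for wi, hi in zip(self.w2, h))
        z = max(-60.0, min(60.0, z))  # clamp to keep exp() in range
        return h, 1.0 / (1.0 + math.exp(-z))

    def train(self, samples, epochs, lr, rng):
        samples = list(samples)
        for _ in range(epochs):
            rng.shuffle(samples)
            for x, y in samples:
                h, p = self._forward(x)
                err = p - y  # gradient of log-loss with respect to the pre-sigmoid output
                for j, hj in enumerate(h):
                    grad_h = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_h * xi
                    self.w1[j][-1] -= lr * grad_h
                self.w2[-1] -= lr * err

    def predict(self, x):
        """True means the classifier considers the encoded object valid."""
        return self._forward(x)[1] >= 0.5


rng = random.Random(0)
valid = build_valid(rng, 300)
invalid = break_valid(rng, valid, 100)
data = [(v, 1.0) for v in sorted(valid)] + [(v, 0.0) for v in sorted(invalid)]
net = TinyNet(n_in=CAPACITY + 1, n_hidden=8, rng=rng)
net.train(data, epochs=200, lr=0.05, rng=rng)
accuracy = sum(net.predict(x) == (y == 1.0) for x, y in data) / len(data)
print(f"training accuracy: {accuracy:.2f}")
```

Once trained, `net.predict(encode(obj))` could stand in for a hand-written invariant check after each operation on the structure. The accuracy printed here is measured on the training data only, so it says nothing about how such a classifier would generalize.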


