Database-assisted automata learning
Hielke Walinga, Robert Baumgartner, Sicco Verwer
arXiv - CS - Formal Languages and Automata Theory, 2024-06-11
https://doi.org/arxiv-2406.07208
Abstract
This paper presents DAALder (Database-Assisted Automata Learning, with the Dutch
suffix taken from leerder), a new algorithm for learning state machines, or automata,
specifically deterministic finite-state automata (DFAs). When learning state
machines from the log data of software systems, the sheer volume of data can
pose a challenge. Conventional state merging algorithms cannot
efficiently deal with this, as they require a large amount of memory. To solve
this, we use database technologies to efficiently query a large trace
dataset and construct a state machine from it, since databases can store large
amounts of data on disk while still allowing it to be queried efficiently.
Building on research in both active and passive learning, the proposed
algorithm combines the two. It can quickly find a characteristic set
of traces from a database using heuristics from a state merging algorithm.
Experiments show that our algorithm has similar performance to conventional
state merging algorithms on large datasets, but requires far less memory.
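
The idea of querying traces from a database instead of holding them all in memory can be pictured with a minimal sketch. This is not the DAALder implementation: the table layout, the one-character-per-symbol trace encoding, and the function names `build_trace_db`, `prefix_counts`, and `next_symbols` are all assumptions made for illustration. It only shows how a learner could ask a database which labeled traces share a prefix, and which symbols can follow that prefix, without first building a full prefix tree.

```python
import sqlite3


def build_trace_db(traces, path=":memory:"):
    """Store (trace, label) pairs in SQLite; label 1 = accepted, 0 = rejected.

    Hypothetical schema: each trace is a string with one character per symbol.
    Passing a file path instead of ":memory:" keeps the data on disk.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE traces (trace TEXT NOT NULL, label INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO traces VALUES (?, ?)", traces)
    conn.commit()
    return conn


def prefix_counts(conn, prefix):
    """Return (accepted, rejected) counts of traces that start with `prefix`."""
    row = conn.execute(
        "SELECT COALESCE(SUM(label = 1), 0), COALESCE(SUM(label = 0), 0) "
        "FROM traces WHERE substr(trace, 1, ?) = ?",
        (len(prefix), prefix),
    ).fetchone()
    return row[0], row[1]


def next_symbols(conn, prefix):
    """Return the sorted symbols that follow `prefix` in at least one trace."""
    rows = conn.execute(
        "SELECT DISTINCT substr(trace, ?, 1) FROM traces "
        "WHERE substr(trace, 1, ?) = ? AND length(trace) > ?",
        (len(prefix) + 1, len(prefix), prefix, len(prefix)),
    ).fetchall()
    return sorted(r[0] for r in rows)


if __name__ == "__main__":
    conn = build_trace_db([("ab", 1), ("abb", 1), ("a", 0), ("ba", 0)])
    print(prefix_counts(conn, "a"))  # accepted/rejected counts under prefix "a"
    print(next_symbols(conn, ""))    # symbols leaving the initial state
```

In this sketch the learner's memory footprint depends on the states it chooses to expand, not on the size of the trace table. For large datasets, a range condition over an indexed `trace` column would likely serve prefix lookups better than `substr`.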