Ruben Becker, Sung-Hwan Kim, Nicola Prezza, Carlo Tosoni
arXiv - CS - Formal Languages and Automata Theory, 2024-06-04, DOI: arxiv-2406.02763
Indexing Finite-State Automata Using Forward-Stable Partitions
An index on a finite-state automaton is a data structure able to locate
specific patterns on the automaton's paths and consequently on the regular
language accepted by the automaton itself. Cotumaccio and Prezza [SODA '21]
introduced a data structure able to solve pattern matching queries on automata,
generalizing the famous FM-index for strings of Ferragina and Manzini [FOCS
'00]. The efficiency of their index depends on the width of a particular
partial order of the automaton's states: the smaller the width of the partial
order, the faster the index. However, computing the partial order of minimal
width is NP-hard. This problem was mitigated by Cotumaccio [DCC '22], who
relaxed the conditions on the partial order, allowing it to be a partial
preorder. This relaxation yields the existence of a unique partial preorder of
minimal width that can be computed in polynomial time. In the paper at hand, we
present a new class of partial preorders and show that they have the following
useful properties: (i) they can be computed in polynomial time, (ii) their
width is never larger than the width of Cotumaccio's preorders, and (iii) there
exist infinite classes of automata on which the width of Cotumaccio's preorder
is linearly larger than the width of our preorder.
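
To make the notion of width concrete: the width of a finite partial order is the
size of its largest antichain, i.e. the largest set of pairwise incomparable
elements. The sketch below computes it by brute force on a toy order. It is only
an illustration of the definition, not the construction used in the paper; the
names `width` and `less_eq` and the divisibility example are assumptions made
here for illustration.

```python
from itertools import combinations


def width(elements, less_eq):
    """Size of the largest antichain of a finite partial order.

    `less_eq(u, v)` is a hypothetical callback returning True when u <= v.
    Brute force over subsets: exponential time, for tiny examples only.
    """
    def comparable(u, v):
        return less_eq(u, v) or less_eq(v, u)

    # Try candidate antichain sizes from largest to smallest.
    for k in range(len(elements), 0, -1):
        for subset in combinations(elements, k):
            if all(not comparable(u, v) for u, v in combinations(subset, 2)):
                return k
    return 0  # empty order


# Toy example (not from the paper): divisibility on the divisors of 12.
divisors = [1, 2, 3, 4, 6, 12]


def divides(a, b):
    return b % a == 0


print(width(divisors, divides))
```

In this toy order the chains 1, 2, 4, 12 and 3, 6 cover every element, while 4
and 6 are incomparable, so the width is 2. A total order has width 1, which is
the case of the FM-index on strings.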