Recognizing Lexicographically Smallest Words and Computing Successors in Regular Languages
Lukas Fleischer, J. Shallit
Int. J. Found. Comput. Sci., published 2021-06-05
DOI: 10.1142/S0129054121420028 (https://doi.org/10.1142/S0129054121420028)
Citations: 2
Abstract
For a formal language $L$, the problem of language enumeration asks to compute the length-lexicographically smallest word in $L$ larger than a given input $w \in \Sigma^*$ (henceforth called the $L$-successor of $w$). We investigate this problem for regular languages from a computational complexity and state complexity perspective. We first show that if $L$ is recognized by a DFA with $n$ states, then $2^{\Theta(\sqrt{n \log n})}$ states are (in general) necessary and sufficient for an unambiguous finite-state transducer to compute $L$-successors. As a byproduct, we obtain that if $L$ is recognized by a DFA with $n$ states, then $2^{\Theta(\sqrt{n \log n})}$ states are sufficient for a DFA to recognize the subset $S(L)$ of $L$ composed of its lexicographically smallest words. We give a matching lower bound that holds even if $S(L)$ is represented as an NFA. It is known that $L$-successors can be computed in polynomial time, even if the regular language is given as part of the input (assuming a suitable representation of the language, such as a DFA). In this paper, we refine this result in multiple directions. We show that if the regular language is given as part of the input and encoded as a DFA, the problem is in $\mathsf{NL}$. If the regular language $L$ is fixed, we prove that the enumeration problem of the language is reducible to deciding membership in the Myhill-Nerode equivalence classes of $L$ under $\mathsf{DLOGTIME}$-uniform $\mathsf{AC}^0$ reductions. In particular, this implies that fixed star-free languages can be enumerated in $\mathsf{AC}^0$, that arbitrary fixed regular languages can be enumerated in $\mathsf{NC}^1$, and that there exist regular languages for which the problem is $\mathsf{NC}^1$-complete.
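To make the successor problem concrete, the following is a minimal Python sketch of a straightforward polynomial-time procedure for a language given as a DFA. It is not the transducer construction or the circuit-level reduction studied in the paper; it only illustrates the problem statement. The function name `l_successor`, the dictionary-based DFA encoding (a possibly partial transition map `delta`), and the use of Python's character order as the alphabet order are all assumptions made for this illustration.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def l_successor(
    alphabet: Iterable[str],
    delta: Dict[Tuple[int, str], int],
    start: int,
    accepting: Set[int],
    w: str,
) -> Optional[str]:
    """Return the length-lexicographically smallest accepted word that is
    larger than w, or None if no such word exists (hypothetical helper)."""
    letters = sorted(alphabet)
    states = {start} | set(accepting) | {p for p, _ in delta} | set(delta.values())
    # If any accepted word is longer than w, one exists with length at most
    # len(w) + number of states (remove a loop from the end of a longer run).
    max_len = len(w) + len(states)

    # ok[k] = states from which an accepting state is reachable in exactly k steps.
    ok: List[Set[int]] = [set(accepting)]
    for _ in range(max_len):
        prev = ok[-1]
        ok.append({q for q in states
                   if any(delta.get((q, c)) in prev for c in letters)})

    def smallest(q: int, k: int) -> str:
        # Lexicographically smallest word of length k accepted from q;
        # the caller guarantees q is in ok[k].
        out = []
        for remaining in range(k, 0, -1):
            for c in letters:
                nxt = delta.get((q, c))
                if nxt in ok[remaining - 1]:
                    out.append(c)
                    q = nxt
                    break
        return "".join(out)

    # path[i] = state after reading w[:i], or None once the run is undefined.
    path: List[Optional[int]] = [start]
    for c in w:
        path.append(delta.get((path[-1], c)))

    # Case 1: same length. Keep the longest possible prefix of w, then use the
    # smallest strictly larger letter that can still be completed.
    for i in range(len(w) - 1, -1, -1):
        for c in letters:
            if c > w[i]:
                nxt = delta.get((path[i], c))
                if nxt in ok[len(w) - i - 1]:
                    return w[:i] + c + smallest(nxt, len(w) - i - 1)

    # Case 2: the shortest strictly longer length that admits an accepted word.
    for m in range(len(w) + 1, max_len + 1):
        if start in ok[m]:
            return smallest(start, m)
    return None


# Example DFA (an assumption for illustration): the language (ab)* over {a, b}.
AB_STAR = dict(alphabet="ab", delta={(0, "a"): 1, (1, "b"): 0},
               start=0, accepting={0})
```

For instance, `l_successor(w="aa", **AB_STAR)` looks for an accepted word of length 2 above `aa` before trying longer lengths, while `l_successor(w="ab", **AB_STAR)` has to move on to length 4. The table `ok` is what keeps the search polynomial: each greedy letter choice is checked against it rather than explored by backtracking.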