On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
Franz Nowak, Anej Svete, Alexandra Butoi, Ryan Cotterell
arXiv:2406.14197 (2024-06-20)

The performance of modern language models (LMs) has been improved by chain-of-thought (CoT) reasoning, i.e., the process of generating intermediate results that guide the model towards a final answer. A possible explanation for this improvement is that CoT reasoning extends an LM's computational power, as RNNs and transformers with additional scratch space are known to be Turing complete. Comparing LMs to Turing machines, however, introduces a category error: Turing machines decide language membership, whereas LMs define distributions over strings. To bridge this gap, we formalize CoT reasoning in a probabilistic setting. We present several results on the representational capacity of recurrent and transformer LMs with CoT reasoning, showing that they can represent the same family of distributions over strings as probabilistic Turing machines.
Quantum automata and languages of finite index
Andrea Benso, Flavio D'Alessandro, Paolo Papi
arXiv:2406.13797 (2024-06-19)

This paper is a continuation of a previous study on the so-called measure-once finite quantum automata model introduced by Moore and Crutchfield in 2000. We investigate conditions ensuring that, given a language recognized by such a device and a language generated by a context-free grammar of finite index or by a matrix context-free grammar, it is recursively decidable whether or not they have a nonempty intersection.
LLM-Oracle Machines
Jie Wang
arXiv:2406.12213 (2024-06-18)

Contemporary AI applications leverage large language models (LLMs) for their knowledge and inference capabilities in natural language processing tasks. This approach aligns with the concept of oracle Turing machines (OTMs). To capture the essence of these computations, including those that are desired but not yet realized in practice, we extend the notion of OTMs by employing a cluster of LLMs as the oracle. We present four variants: basic, augmented, fault-avoidance, and $\epsilon$-fault. The first two variants are commonly observed, whereas the latter two are specifically designed to ensure reliable outcomes by addressing LLM hallucinations, biases, and inconsistencies.
Computing the Bandwidth of Meager Timed Automata
Eugene Asarin, Aldric Degorre, Catalin Dima, Bernardo Jacobo Inclán
arXiv:2406.12694 (2024-06-18)

The bandwidth of timed automata characterizes the quantity of information produced/transmitted per time unit. We previously delimited three classes of timed automata (TA) according to the nature of their asymptotic bandwidth: meager, normal, and obese. In this paper, we propose a method, based on a finite-state simply-timed abstraction, to compute the actual value of the bandwidth of meager automata. The states of this abstraction correspond to barycenters of the faces of the simplices in the region automaton. The bandwidth is then $\log 1/|z_0|$, where $z_0$ is the smallest root (in modulus) of the characteristic polynomial of this finite-state abstraction.
Reversible Transducers over Infinite Words
Luc Dartois, Paul Gastin, Loïc Germerie Guizouarn, R. Govind, Shankaranarayanan Krishna
arXiv:2406.11488 (2024-06-17)

Deterministic two-way transducers capture the class of regular functions. The efficiency of composing two-way transducers has a direct implication in algorithmic problems related to reactive synthesis, where transformation specifications are converted into equivalent transducers. These specifications are presented in a modular way, and composing the resultant machines simulates the full specification. An important result by Dartois et al. shows that the composition of two-way transducers is polynomial when the underlying transducers are reversible, that is, both deterministic and co-deterministic. This is a major improvement over general deterministic two-way transducers, for which composition in general causes a doubly exponential blow-up in the size of the inputs. Moreover, they show that reversible two-way transducers have the same expressiveness as deterministic two-way transducers. However, the expressiveness of reversible transducers over infinite words has remained open. In this article, we introduce the class of reversible two-way transducers over infinite words and show that they enjoy the same expressive power as deterministic two-way transducers over infinite words. This is done through a non-trivial, effective construction inducing a single exponential blow-up in the set of states. Further, we prove that composing two reversible two-way transducers over infinite words incurs only polynomial complexity, thereby providing the foundations for an efficient composition procedure for transducers over infinite words.
ω-regular Expression Synthesis from Transition-Based Büchi Automata
Charles Pert, Dalal Alrajeh, Alessandra Russo
arXiv:2406.08136 (2024-06-12)

A popular method for modelling reactive systems is to use $\omega$-regular languages. These languages can be represented as nondeterministic Büchi automata (NBAs) or $\omega$-regular expressions. Existing methods synthesise expressions from state-based NBAs. Synthesis from transition-based NBAs is traditionally done by transforming transition-based NBAs into state-based NBAs. This transformation, however, can increase the complexity of the synthesised expressions. This paper proposes a novel method for directly synthesising $\omega$-regular expressions from transition-based NBAs. We prove that the method is sound and complete. Our empirical results show that the $\omega$-regular expressions synthesised from transition-based NBAs are more compact than those synthesised from state-based NBAs. This is particularly the case for NBAs computed from obligation, reactivity, safety and recurrence-type LTL formulas, reporting in the latter case an average reduction of over 50%. We also show that our method successfully synthesises $\omega$-regular expressions from more LTL formulas when using a transition-based instead of a state-based NBA.
Analyzing constrained LLM through PDFA-learning
Matías Carrasco, Franz Mayr, Sergio Yovine, Johny Kidd, Martín Iturbide, Juan Pedro da Silva, Alejo Garat
arXiv:2406.08269 (2024-06-12)

We define a congruence that copes with the null next-symbol probabilities that arise when the output of a language model is constrained by some means during text generation. We develop an algorithm for efficiently learning the quotient with respect to this congruence and evaluate it on case studies for analyzing statistical properties of LLMs.
Database-assisted automata learning
Hielke Walinga, Robert Baumgartner, Sicco Verwer
arXiv:2406.07208 (2024-06-11)

This paper presents DAALder (Database-Assisted Automata Learning, with the Dutch suffix from "leerder"), a new algorithm for learning state machines, or automata, specifically deterministic finite-state automata (DFAs). When learning state machines from log data originating from software systems, the sheer amount of log data can pose a challenge. Conventional state-merging algorithms cannot deal with this efficiently, as they require a large amount of memory. To solve this, we use database technologies to query a large trace dataset efficiently and construct a state machine from it, as databases can store large amounts of data on disk while still supporting efficient queries. Building on research in both active learning and passive learning, the proposed algorithm is a combination of the two. It can quickly find a characteristic set of traces from a database using heuristics from a state-merging algorithm. Experiments show that our algorithm has performance similar to conventional state-merging algorithms on large datasets, but requires far less memory.
Learning EFSM Models with Registers in Guards
Germán Vega, Roland Groz, Catherine Oriat, Michael Foster, Neil Walkinshaw, Adenilso Simão
arXiv:2406.07040 (2024-06-11)

This paper presents an active inference method for Extended Finite State Machines (EFSMs), where inputs and outputs are parametrized and transitions can be conditioned by guards involving input parameters and internal variables called registers. The method applies to (software) systems that cannot be reset, so it learns an EFSM model of the system from a single trace.
Attributed Tree Transducers for Partial Functions
Sebastian Maneth, Martin Vu
arXiv:2406.06141 (2024-06-10)

Attributed tree transducers (atts) have been equipped with regular look-around (i.e., a preprocessing via an attributed relabeling) in order to obtain a more robust class of translations. Here we give further evidence of this robustness: we show that if the class of translations realized by nondeterministic atts with regular look-around is restricted to partial functions, then we obtain exactly the class of translations realized by deterministic atts with regular look-around.