In modern machine learning (ML) systems, Transformer-based architectures have achieved milestone success across a broad spectrum of tasks, yet understanding their operational mechanisms remains an open problem. To improve the transparency of ML systems, automata extraction methods, which interpret stateful ML models as automata, typically through formal languages, have proven effective for explaining the mechanism of recurrent neural networks (RNNs). However, few works have applied this paradigm to Transformer models. In particular, understanding how they process formal languages and identifying their limitations in this area remain unexplored. In this paper, we propose an automata extraction algorithm specifically designed for Transformer models. Treating the Transformer model as a black-box system, we track how its internal latent representations are transformed during operation, and then use classical pedagogical approaches such as the L* algorithm to interpret them as deterministic finite-state automata (DFA). Overall, our study reveals how the Transformer model comprehends the structure of formal languages, which not only enhances the interpretability of Transformer-based ML systems but also marks a crucial step toward a deeper understanding of how ML systems process formal languages. Code and data are available at https://github.com/Zhang-Yihao/Transfomer2DFA.
Automata Extraction from Transformers. Yihao Zhang, Zeming Wei, Meng Sun. arXiv - CS - Formal Languages and Automata Theory, 2024-06-08. DOI: arxiv-2406.05564 (https://doi.org/arxiv-2406.05564).
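The black-box view in the abstract above can be illustrated with a minimal sketch. In pedagogical extraction, an equivalence query against a black box is often approximated by comparing a candidate DFA with a membership oracle on all strings up to some bounded length. The names `DFA`, `find_counterexample`, and the toy oracle `even_ones` are hypothetical stand-ins and are not taken from the authors' repository; in practice the oracle would wrap a trained Transformer classifier.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class DFA:
    start: int
    accepting: Set[int]
    delta: Dict[Tuple[int, str], int]  # (state, symbol) -> next state

    def accepts(self, word: str) -> bool:
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


def find_counterexample(
    dfa: DFA,
    oracle: Callable[[str], bool],
    alphabet: str,
    max_len: int,
) -> Optional[str]:
    """Return the first word (shortest first) on which DFA and oracle disagree."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if dfa.accepts(word) != oracle(word):
                return word
    return None


def even_ones(word: str) -> bool:
    """Toy oracle (assumption): accepts words with an even number of '1's."""
    return word.count("1") % 2 == 0


# A correct two-state hypothesis and a deliberately wrong one-state hypothesis.
good = DFA(0, {0}, {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0})
bad = DFA(0, {0}, {(0, "0"): 0, (0, "1"): 0})

print(find_counterexample(good, even_ones, "01", 6))
print(find_counterexample(bad, even_ones, "01", 6))
```

A learner such as L* would use a returned counterexample to refine its hypothesis; the bounded search only gives evidence of equivalence up to `max_len`, not a proof.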
Ruben Becker, Sung-Hwan Kim, Nicola Prezza, Carlo Tosoni
An index on a finite-state automaton is a data structure able to locate specific patterns on the automaton's paths and consequently on the regular language accepted by the automaton itself. Cotumaccio and Prezza [SODA '21] introduced a data structure able to solve pattern matching queries on automata, generalizing the famous FM-index for strings of Ferragina and Manzini [FOCS '00]. The efficiency of their index depends on the width of a particular partial order of the automaton's states: the smaller the width of the partial order, the faster the index. However, computing the partial order of minimal width is NP-hard. This problem was mitigated by Cotumaccio [DCC '22], who relaxed the conditions on the partial order, allowing it to be a partial preorder. This relaxation yields the existence of a unique partial preorder of minimal width that can be computed in polynomial time. In the paper at hand, we present a new class of partial preorders and show that they have the following useful properties: (i) they can be computed in polynomial time, (ii) their width is never larger than the width of Cotumaccio's preorders, and (iii) there exist infinite classes of automata on which the width of Cotumaccio's preorder is linearly larger than the width of our preorder.
Indexing Finite-State Automata Using Forward-Stable Partitions. Ruben Becker, Sung-Hwan Kim, Nicola Prezza, Carlo Tosoni. arXiv - CS - Formal Languages and Automata Theory, 2024-06-04. DOI: arxiv-2406.02763 (https://doi.org/arxiv-2406.02763).
Pascal Baumann, Eren Keskin, Roland Meyer, Georg Zetzsche
The omega-regular separability problem for Büchi VASS coverability languages has recently been shown to be decidable, but with an EXPSPACE lower bound and a non-primitive recursive upper bound, leaving the exact complexity open. We close this gap and show that the problem is EXPSPACE-complete. A careful analysis of our complexity bounds additionally yields a PSPACE procedure in the case of fixed dimension >= 1, which matches a pre-established lower bound of PSPACE for one-dimensional Büchi VASS. Our algorithm is a non-deterministic search for a witness whose size, as we show, can be suitably bounded. Part of the procedure is to decide the existence of runs in VASS that satisfy certain non-linear properties. Therefore, a key technical ingredient is to analyze a class of systems of inequalities where one variable may occur in non-linear (polynomial) expressions. These so-called singly non-linear systems (SNLS) take the form A(x)·y >= b(x), where A(x) and b(x) are a matrix and a vector, respectively, whose entries are polynomials in x, and y ranges over vectors in the rationals. Our main contribution on SNLS is an exponential upper bound on the size of rational solutions to singly non-linear systems. The proof consists of three steps. First, we give a tailor-made quantifier elimination to characterize all real solutions for x. Second, using the root separation theorem about the distance of real roots of polynomials, we show that if a rational solution exists, then there is one with at most polynomially many bits. Third, we insert the solution for x into the SNLS, making it linear and allowing us to invoke standard solution bounds from convex geometry. Finally, we combine the results about SNLS with several techniques from the area of VASS to devise an EXPSPACE decision procedure for omega-regular separability of Büchi VASS.
Separability in Büchi VASS and Singly Non-Linear Systems of Inequalities. Pascal Baumann, Eren Keskin, Roland Meyer, Georg Zetzsche. arXiv - CS - Formal Languages and Automata Theory, 2024-06-03. DOI: arxiv-2406.01008 (https://doi.org/arxiv-2406.01008).
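The third proof step in the abstract above (fixing x turns a singly non-linear system into an ordinary linear one in y) can be sketched with exact rational arithmetic. The particular matrix A(x), vector b(x), and the helper names `instantiate` and `satisfies` are made up for illustration and do not come from the paper.

```python
from fractions import Fraction
from typing import Callable, List, Tuple

Poly = Callable[[Fraction], Fraction]  # a polynomial in the single variable x


def instantiate(
    A: List[List[Poly]], b: List[Poly], x: Fraction
) -> Tuple[List[List[Fraction]], List[Fraction]]:
    """Evaluate every polynomial entry at x, yielding a linear system in y."""
    A_x = [[p(x) for p in row] for row in A]
    b_x = [p(x) for p in b]
    return A_x, b_x


def satisfies(
    A_x: List[List[Fraction]], b_x: List[Fraction], y: List[Fraction]
) -> bool:
    """Check A_x . y >= b_x row by row."""
    return all(
        sum(a * yi for a, yi in zip(row, y)) >= rhs
        for row, rhs in zip(A_x, b_x)
    )


# Hypothetical system: A(x) = [[x, 1], [x^2 - 2, -1]], b(x) = [1, 0].
A = [
    [lambda x: x, lambda x: Fraction(1)],
    [lambda x: x * x - 2, lambda x: Fraction(-1)],
]
b = [lambda x: Fraction(1), lambda x: Fraction(0)]

A_x, b_x = instantiate(A, b, Fraction(3, 2))
print(satisfies(A_x, b_x, [Fraction(4), Fraction(1)]))
print(satisfies(A_x, b_x, [Fraction(0), Fraction(0)]))
```

Using `Fraction` keeps the check exact, which matters here because the abstract's bounds concern the bit size of rational solutions rather than floating-point approximations.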
We study connections between linear equations over various semigroups and recursively enumerable sets of positive integers. We give variants of the universal Diophantine representation of recursively enumerable sets of positive integers established by Matiyasevich. These variants use linear equations with one unknown instead of polynomial equations with several unknowns. As a corollary we get undecidability results for linear equations over morphism semigroups and over matrix semigroups.
Linear equations and recursively enumerable sets. Juha Honkala. arXiv - CS - Formal Languages and Automata Theory, 2024-06-02. DOI: arxiv-2406.00688 (https://doi.org/arxiv-2406.00688).
Manfred Droste, Zoltán Fülöp, Andreja Tepavčević, Heiko Vogler
We consider the images of the initial algebra semantics of weighted tree automata over strong bimonoids (hence also over semirings). These images are subsets of the carrier set of the underlying strong bimonoid. We consider locally finite, weakly locally finite, and bi-locally finite strong bimonoids. We show that there exists a strong bimonoid which is weakly locally finite and not locally finite. We also show that if the ranked alphabet contains a binary symbol, then for any finitely generated strong bimonoid, weighted tree automata can generate, via their initial algebra semantics, all elements of the strong bimonoid. As a consequence of these results, for weakly locally finite strong bimonoids which are not locally finite, weighted tree automata can generate infinite images provided that the input ranked alphabet contains at least one binary symbol. This is in sharp contrast to the setting of weighted string automata, where each such image is known to be finite. As a further consequence, for any finitely generated semiring, there exists a weighted tree automaton which generates, via its run semantics, all elements of the semiring.
The generating power of weighted tree automata with initial algebra semantics. Manfred Droste, Zoltán Fülöp, Andreja Tepavčević, Heiko Vogler. arXiv - CS - Formal Languages and Automata Theory, 2024-05-31. DOI: arxiv-2405.20753 (https://doi.org/arxiv-2405.20753).
Andrew C. Li, Zizhao Chen, Toryn Q. Klassen, Pashootan Vaezipoor, Rodrigo Toro Icarte, Sheila A. McIlraith
Reward Machines provide an automata-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing complex reward function structure, they enable counterfactual learning updates that have resulted in impressive sample efficiency gains. While Reward Machines have been employed in both tabular and deep RL settings, they have typically relied on a ground-truth interpretation of the domain-specific vocabulary that forms the building blocks of the reward function. Such ground-truth interpretations can be elusive in many real-world settings, due in part to partial observability or noisy sensing. In this paper, we explore the use of Reward Machines for Deep RL in noisy and uncertain environments. We characterize this problem as a POMDP and propose a suite of RL algorithms that leverage task structure under uncertain interpretation of domain-specific vocabulary. Theoretical analysis exposes pitfalls in naive approaches to this problem, while experimental results show that our algorithms successfully leverage task structure to improve performance under noisy interpretations of the vocabulary. Our results provide a general framework for exploiting Reward Machines in partially observable environments.
Reward Machines for Deep RL in Noisy and Uncertain Environments. Andrew C. Li, Zizhao Chen, Toryn Q. Klassen, Pashootan Vaezipoor, Rodrigo Toro Icarte, Sheila A. McIlraith. arXiv - CS - Formal Languages and Automata Theory, 2024-05-31. DOI: arxiv-2406.00120 (https://doi.org/arxiv-2406.00120).
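As a rough illustration of the structure the abstract above refers to, a Reward Machine can be sketched as a finite-state machine whose transitions fire on propositions from the domain vocabulary and emit rewards. The class below, the propositions `coffee` and `office`, and the reward values are hypothetical. The sketch also assumes ground-truth proposition labels, which is exactly the assumption the paper sets out to relax.

```python
from typing import Dict, Set, Tuple


class RewardMachine:
    """Finite-state reward function: (state, proposition) -> (next state, reward)."""

    def __init__(
        self, start: str, transitions: Dict[Tuple[str, str], Tuple[str, float]]
    ) -> None:
        self.state = start
        self.transitions = transitions

    def step(self, true_props: Set[str]) -> float:
        """Advance on the first matching proposition; stay put with reward 0 otherwise."""
        for prop in sorted(true_props):  # sorted for deterministic tie-breaking
            key = (self.state, prop)
            if key in self.transitions:
                self.state, reward = self.transitions[key]
                return reward
        return 0.0


# Hypothetical task: fetch coffee first, then deliver it to the office.
rm = RewardMachine(
    start="u0",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u2", 1.0),
    },
)

rewards = [rm.step(props) for props in [set(), {"office"}, {"coffee"}, {"office"}]]
print(rewards, rm.state)
```

Visiting the office before fetching coffee earns nothing, which is the kind of temporally extended behaviour a plain state-based reward cannot express without the machine's memory.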
Let G be a context-free grammar (CFG) in Chomsky normal form. We take the number of rules in G to be the size of G. We also assume all CFGs are in Chomsky normal form. We consider the following question: given a string w of length n, what is the smallest CFG G such that L(G)={w}? We show the following: 1) For all w with |w|=n, there is a CFG G with O(n/log n) rules such that L(G)={w}. 2) There exists a string w with |w|=n such that every CFG G with L(G)={w} is of size Omega(n/log n). We give two proofs: one nonconstructive, the other constructive.
The CFG Complexity of Singleton Sets. Lance Fortnow, William Gasarch. arXiv - CS - Formal Languages and Automata Theory, 2024-05-30. DOI: arxiv-2405.20026 (https://doi.org/arxiv-2405.20026).
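To make the size measure in the abstract above concrete, here is a small sketch of a Chomsky-normal-form grammar whose language is a single string. It covers only an easy special case, the highly repetitive string a^(2^k), which needs just k+1 rules; it is not the general O(n/log n) construction from the paper. The function names `power_grammar` and `expand` are hypothetical.

```python
from typing import Dict, Tuple, Union

# A CNF rule body is either one terminal or a pair of nonterminals.
Rule = Union[str, Tuple[str, str]]


def power_grammar(k: int) -> Dict[str, Rule]:
    """CNF grammar with k+1 rules: A0 -> a, and Ai -> A(i-1) A(i-1) for i = 1..k."""
    rules: Dict[str, Rule] = {"A0": "a"}
    for i in range(1, k + 1):
        rules[f"A{i}"] = (f"A{i - 1}", f"A{i - 1}")
    return rules


def expand(rules: Dict[str, Rule], symbol: str) -> str:
    """Derive the unique string generated from `symbol` (one rule per nonterminal)."""
    rhs = rules[symbol]
    if isinstance(rhs, str):
        return rhs
    left, right = rhs
    return expand(rules, left) + expand(rules, right)


grammar = power_grammar(4)
word = expand(grammar, "A4")
print(len(grammar), len(word))
```

Because every nonterminal has exactly one rule, the grammar generates exactly one string. The lower bound in the abstract says that some strings of length n admit no grammar nearly this small.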
We propose DFAMiner, a passive learning tool for learning minimal separating deterministic finite automata (DFA) from a set of labelled samples. Separating automata are an interesting class of automata that occurs generally in regular model checking and has raised interest in foundational questions of parity game solving. We first propose a simple and linear-time algorithm that incrementally constructs a three-valued DFA (3DFA) from a set of labelled samples given in the usual lexicographical order. This 3DFA has accepting and rejecting states as well as don't-care states, so that it can exactly recognise the labelled examples. We then apply our tool to mining a minimal separating DFA for the labelled samples by minimising the constructed automata via a reduction to solving SAT problems. Empirical evaluation shows that our tool outperforms current state-of-the-art tools significantly on standard benchmarks for learning minimal separating DFAs from samples. Progress in the efficient construction of separating DFAs can also lead to finding the lower bound of parity game solving, where we show that DFAMiner can create optimal separating automata for simple languages with up to 7 colours. Future improvements might offer inroads to better data structures.
DFAMiner: Mining minimal separating DFAs from labelled samples. Daniele Dell'Erba, Yong Li, Sven Schewe. arXiv - CS - Formal Languages and Automata Theory, 2024-05-29. DOI: arxiv-2405.18871 (https://doi.org/arxiv-2405.18871).
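The notion of a three-valued DFA in the abstract above can be sketched with a plain prefix tree whose states are labelled accept, reject, or don't-care. This is a simpler stand-in for illustration only: it is not DFAMiner's incremental construction and it performs no minimisation. The class and constant names are hypothetical.

```python
from typing import Dict, List, Tuple

ACCEPT, REJECT, DONT_CARE = "accept", "reject", "dont_care"


class PrefixTree3DFA:
    """Prefix-tree automaton that recognises the labelled samples exactly."""

    def __init__(self) -> None:
        self.delta: Dict[Tuple[int, str], int] = {}
        self.label: List[str] = [DONT_CARE]  # state 0 is the root

    def add(self, word: str, accepted: bool) -> None:
        state = 0
        for symbol in word:
            key = (state, symbol)
            if key not in self.delta:
                self.label.append(DONT_CARE)  # fresh state, unlabelled so far
                self.delta[key] = len(self.label) - 1
            state = self.delta[key]
        self.label[state] = ACCEPT if accepted else REJECT

    def classify(self, word: str) -> str:
        state = 0
        for symbol in word:
            nxt = self.delta.get((state, symbol))
            if nxt is None:
                return DONT_CARE  # word leaves the tree: no sample constrains it
            state = nxt
        return self.label[state]


tree = PrefixTree3DFA()
for sample, accepted in [("", False), ("a", True), ("ab", False), ("ba", True)]:
    tree.add(sample, accepted)

print([tree.classify(w) for w in ["", "a", "ab", "b", "bb"]])
```

A separating DFA is then any ordinary DFA that agrees with the accept and reject labels and is free to choose either answer on don't-care words, which is the freedom a minimisation step can exploit.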
When monitoring a cyber-physical system (CPS) from a remote server, keeping the monitored data secret is crucial, particularly when they contain sensitive information, e.g., biological or location data. Recently, Banno et al. (CAV'22) proposed a protocol for online LTL monitoring that keeps data concealed from the server using Fully Homomorphic Encryption (FHE). We build on this protocol to allow arithmetic operations over encrypted values, e.g., to compute a safety measurement combining distance, velocity, and so forth. Overall, our protocol enables oblivious online monitoring of discrete-time real-valued signals against signal temporal logic (STL) formulas. Our protocol combines two FHE schemes, CKKS and TFHE, leveraging their respective strengths. We employ CKKS to evaluate arithmetic predicates in STL formulas while utilizing TFHE to process them using a DFA derived from the STL formula. We conducted case studies on monitoring blood glucose levels and vehicles' behavior against the Responsibility-Sensitive Safety (RSS) rules. Our results suggest the practical relevance of our protocol.
Oblivious Monitoring for Discrete-Time STL via Fully Homomorphic Encryption. Masaki Waga, Kotaro Matsuoka, Takashi Suwa, Naoki Matsumoto, Ryotaro Banno, Song Bian, Kohei Suenaga. arXiv - CS - Formal Languages and Automata Theory, 2024-05-27. DOI: arxiv-2405.16767 (https://doi.org/arxiv-2405.16767).
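The two-stage pipeline described in the abstract above (arithmetic predicates first, then a DFA over their truth values) can be sketched in plaintext. Nothing below is encrypted and no FHE library is involved; the first function only marks where CKKS would be used and the second where TFHE would be used. The property ("glucose is never above the threshold for 3 consecutive samples"), the threshold of 180.0, and the function names are all assumptions made for illustration.

```python
from typing import Iterable, List


def predicate(glucose: float, threshold: float = 180.0) -> bool:
    """Arithmetic stage (the role CKKS plays in the protocol): compare a real value."""
    return glucose > threshold


def monitor(signal: Iterable[float], window: int = 3) -> List[bool]:
    """DFA stage (the role TFHE plays): count consecutive high samples.

    States 0..window-1 count the current run of high samples; state `window`
    is an absorbing violation state. Each verdict says whether the property
    still holds after the sample.
    """
    state = 0
    verdicts: List[bool] = []
    for value in signal:
        if state < window:
            state = state + 1 if predicate(value) else 0
        verdicts.append(state < window)
    return verdicts


readings = [150.0, 190.0, 200.0, 170.0, 185.0, 190.0, 195.0, 120.0]
print(monitor(readings))
```

The verdict turns false at the seventh sample, where the third consecutive high reading arrives, and stays false afterwards because a safety violation cannot be undone.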
Recently, recurrent models based on linear state space models (SSMs) have shown promising performance in language modeling (LM), competitive with transformers. However, there is little understanding of the in-principle abilities of such models, which could provide useful guidance to the search for better LM architectures. We present a comprehensive theoretical study of the capacity of such SSMs as it compares to that of transformers and traditional RNNs. We find that SSMs and transformers have overlapping but distinct strengths. In star-free state tracking, SSMs implement straightforward and exact solutions to problems that transformers struggle to represent exactly. They can also model bounded hierarchical structure with optimal memory even without simulating a stack. On the other hand, we identify a design choice in current SSMs that limits their expressive power. We discuss implications for SSM and LM research, and verify results empirically on a recent SSM, Mamba.
The Expressive Capacity of State Space Models: A Formal Language Perspective. Yash Sarrof, Yana Veitsman, Michael Hahn. arXiv - CS - Formal Languages and Automata Theory, 2024-05-27. DOI: arxiv-2405.17394 (https://doi.org/arxiv-2405.17394).
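One way to picture the "straightforward and exact" state tracking mentioned in the abstract above is a one-dimensional, input-gated linear recurrence h_t = a_t * h_{t-1} + b_t that stores a single bit. The sketch below is an assumption-laden toy: the token names `w0`, `w1`, `i` and the gate values are chosen for illustration and are not taken from the paper or from Mamba's implementation.

```python
from typing import List, Tuple


def gates(token: str) -> Tuple[float, float]:
    """Map a token to (a_t, b_t) for the recurrence h_t = a_t * h_{t-1} + b_t."""
    if token == "w0":   # write 0: forget the old state, store 0
        return 0.0, 0.0
    if token == "w1":   # write 1: forget the old state, store 1
        return 0.0, 1.0
    if token == "i":    # ignore: keep the old state unchanged
        return 1.0, 0.0
    raise ValueError(f"unknown token: {token}")


def run(tokens: List[str]) -> List[float]:
    """Return the hidden state after each token, starting from h = 0."""
    h = 0.0
    states: List[float] = []
    for token in tokens:
        a, b = gates(token)
        h = a * h + b
        states.append(h)
    return states


print(run(["w1", "i", "i", "w0", "i", "w1"]))
```

The hidden state always equals the most recently written bit, no matter how many ignore tokens intervene, so the memory is exact at any sequence length.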