{"title":"LLMs as Probabilistic Minimally Adequate Teachers for DFA Learning","authors":"Lekai Chen, Ashutosh Trivedi, Alvaro Velasquez","doi":"arxiv-2408.02999","DOIUrl":null,"url":null,"abstract":"The emergence of intelligence in large language models (LLMs) has inspired\ninvestigations into their integration into automata learning. This paper\nintroduces the probabilistic Minimally Adequate Teacher (pMAT) formulation,\nwhich leverages a probabilistic oracle that could give persistent errors\nrandomly during answering the membership queries for deterministic finite\nautomata (DFA) learning. Given the tendency of LLMs to produce hallucinatory\ncontent, we have developed techniques to improve answer accuracy and ensure the\ncorrectness of the learned automata. We propose the $\\mathtt{Discrimination}$\nprompt as well as the $\\mathtt{Verification}$ prompt and explore their\nadvantages over common prompts. Additionally, we compare DFA learning\nperformance between the TTT algorithm and common active learning algorithms. To\naddress the exponential number of persistent errors, we implement a dynamic\nquery cache refinement algorithm that identifies and corrects conflicting\nqueries by combining the active and passive learning algorithms. The empirical\nresults demonstrate the robustness and efficiency of our approach, providing a\ntheoretical foundation for automata learning with LLMs in the loop.","PeriodicalId":501124,"journal":{"name":"arXiv - CS - Formal Languages and Automata Theory","volume":"62 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Formal Languages and Automata Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.02999","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The emergence of intelligence in large language models (LLMs) has inspired investigations into their integration into automata learning. This paper introduces the probabilistic Minimally Adequate Teacher (pMAT) formulation, which leverages a probabilistic oracle that may randomly give persistent errors when answering membership queries during deterministic finite automaton (DFA) learning. Given the tendency of LLMs to produce hallucinated content, we develop techniques to improve answer accuracy and ensure the correctness of the learned automata. We propose the $\mathtt{Discrimination}$ prompt and the $\mathtt{Verification}$ prompt and explore their advantages over common prompts. Additionally, we compare DFA learning performance between the TTT algorithm and common active learning algorithms. To address the exponentially many persistent errors, we implement a dynamic query cache refinement algorithm that identifies and corrects conflicting queries by combining active and passive learning. The empirical results demonstrate the robustness and efficiency of our approach, providing a theoretical foundation for automata learning with LLMs in the loop.
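To make the pMAT setting concrete, here is a minimal, hypothetical Python sketch (not the authors' implementation) of a membership oracle whose errors are persistent: answers are cached, so re-asking the same query returns the same, possibly wrong, label, and the cache can only be corrected by explicit refinement, e.g. after an equivalence query returns a counterexample. The class name `ProbabilisticMAT`, the `refine` method, and the noisy stub `noisy_even_zeros` standing in for an LLM answering a membership prompt are all illustrative assumptions.

```python
import random
from typing import Callable, Dict

class ProbabilisticMAT:
    """Sketch of a membership oracle with persistent (cached) errors."""

    def __init__(self, llm_membership: Callable[[str], bool]):
        # `llm_membership` stands in for an LLM answering a membership prompt.
        self.llm_membership = llm_membership
        self.cache: Dict[str, bool] = {}  # persistent answers: word -> label

    def membership(self, word: str) -> bool:
        # Persistent-error model: once the oracle has answered a word, it
        # never changes its mind unless the cache is explicitly refined.
        if word not in self.cache:
            self.cache[word] = self.llm_membership(word)
        return self.cache[word]

    def refine(self, word: str, correct_label: bool) -> None:
        # Dynamic cache refinement: overwrite a cached answer that a
        # counterexample has shown to conflict with the target language.
        self.cache[word] = correct_label


def noisy_even_zeros(word: str, error_rate: float = 0.1) -> bool:
    # Stub "LLM": target language = binary strings with an even number of 0s,
    # answered incorrectly with probability `error_rate`.
    truth = word.count("0") % 2 == 0
    return truth if random.random() > error_rate else not truth


if __name__ == "__main__":
    random.seed(0)
    oracle = ProbabilisticMAT(noisy_even_zeros)
    w = "0010"
    first = oracle.membership(w)
    # Persistence: repeating the query returns the same (possibly wrong) label.
    assert oracle.membership(w) == first
    # Suppose an equivalence query later exposes w as a counterexample; the
    # learner refines the cache before continuing to learn.
    oracle.refine(w, correct_label=(w.count("0") % 2 == 0))
    print(w, "->", oracle.membership(w))
```

The point of the cache is that, unlike i.i.d. noise, persistent errors cannot be averaged away by repeated queries; corrections must instead come from conflict detection, which is why the abstract pairs the active learner with a passive, counterexample-driven refinement step.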