On the Expressive Power of Regular Expressions with Backreferences

Taisei Nogami, Tachio Terauchi
International Symposium on Mathematical Foundations of Computer Science. Published 2023-07-13.
DOI: 10.48550/arXiv.2307.08531 (https://doi.org/10.48550/arXiv.2307.08531)

Abstract

A rewb is a regular expression extended with backreferences. Backreferences are a widely used practical extension of regular expressions, supported by most modern regular expression engines, including those in the standard libraries of Java and Python. Indexed languages, meanwhile, are the languages generated by indexed grammars, a class of formal grammars proposed by A. V. Aho. We show that the expressive powers of these two models are related as follows: every language described by a rewb is an indexed language. Since the smallest formal grammar class previously known to contain the rewb languages was the class of context-sensitive languages, our result strictly improves the known upper bound. Moreover, we prove two further claims: there exists a rewb whose language does not belong to the class of stack languages, a proper subclass of the indexed languages; and the language described by a rewb without a captured reference belongs to the class of nonerasing stack languages, a proper subclass of the stack languages. Finally, we show that the hierarchy investigated in a prior study, which separates the expressive power of rewbs by the notion of nested levels, lies within the class of nonerasing stack languages.
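
As a small illustration of what a backreference adds, the following sketch uses Python's standard `re` module to recognize the copy language { ww : w ∈ {a, b}* }, a standard example of a language that is not context-free and hence not regular. The example and the helper name `in_copy_language` are illustrative assumptions and are not taken from the paper.

```python
import re

# Rewb for the copy language { ww : w in {a, b}* }.
# Group 1 captures w; the backreference \1 must then match
# exactly the same string a second time.
COPY = re.compile(r"([ab]*)\1")


def in_copy_language(s: str) -> bool:
    """Return True if the whole string s has the form ww, w over {a, b}."""
    return COPY.fullmatch(s) is not None


if __name__ == "__main__":
    for s in ["", "aa", "abab", "abba", "aba"]:
        print(repr(s), in_copy_language(s))
```

Here `abab` is accepted (w = `ab`), whereas `abba` and `aba` are rejected because no split into two identical halves exists.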