Extracting Forbidden Factors from Regular Stringsets

Mathematics of Language. Pub Date: 2017-07-01. DOI: 10.18653/v1/W17-3404

J. Rogers, D. Lambert

Citations: 4

Abstract

The work presented here continues a program of completely characterizing the constraints on the distribution of stress in human languages that are documented in the StressTyp2 database with respect to the Local and Piecewise sub-regular hierarchies. We introduce algorithms that, given a Finite-State Automaton, compute a set of forbidden words, units, initial factors, free factors and final factors that define a Strictly Local (SL) approximation of the stringset recognized by the FSA, along with a minimal DFA that recognizes the residue set: the set of strings in the approximation that are not in the stringset recognized by the FSA. If the FSA recognizes an SL stringset, then the approximation is exact (otherwise it overgenerates). We have applied these tools to the 106 lects that have associated DFAs in the StressTyp2 database, a wide-coverage corpus of stress patterns that are attested in human languages. The results include a large number of strictly local constraints that have not been included in prior work categorizing these patterns with respect to the Local and Piecewise Sub-Regular hierarchies of Rogers et al. (2012), although, of course, they do not contradict the central result of that work, which establishes an upper bound on their complexity that includes strictly local constraints.
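To make the idea of a Strictly Local approximation concrete, the following is a minimal sketch and not the authors' algorithm. It fixes the factor width at k = 2, whereas the abstract describes extracting forbidden words, units, and initial, free and final factors, and it does not build the minimal DFA for the residue set. All names (`augment`, `trim`, `permitted_factors`, `forbidden_factors`, `in_approximation`, `accepts`), the boundary markers `>` and `<`, and the toy language "exactly one `s` among `u`s" are assumptions made for illustration. Words shorter than k - 2 symbols are not handled, which does not matter for k = 2.

```python
from itertools import product

LEFT, RIGHT = ">", "<"  # ASCII stand-ins for the word-boundary markers


def augment(delta, start, accepting):
    """Add INIT --LEFT--> start and accepting --RIGHT--> FIN to the DFA's edges."""
    edges = {"INIT": [(LEFT, start)]}
    for (q, a), r in delta.items():
        edges.setdefault(q, []).append((a, r))
    for q in accepting:
        edges.setdefault(q, []).append((RIGHT, "FIN"))
    return edges


def reachable(edges, source):
    """States reachable from source by following edges."""
    seen, stack = {source}, [source]
    while stack:
        q = stack.pop()
        for _, r in edges.get(q, []):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def trim(edges):
    """Keep only states that lie on some path from INIT to FIN."""
    reverse = {}
    for q, outs in edges.items():
        for a, r in outs:
            reverse.setdefault(r, []).append((a, q))
    useful = reachable(edges, "INIT") & reachable(reverse, "FIN")
    return {q: [(a, r) for a, r in outs if r in useful]
            for q, outs in edges.items() if q in useful}


def permitted_factors(edges, k):
    """Labels of all length-k paths in the trimmed, boundary-augmented automaton."""
    factors = set()

    def walk(q, label):
        if len(label) == k:
            factors.add(label)
            return
        for a, r in edges.get(q, []):
            walk(r, label + a)

    for q in edges:
        walk(q, "")
    return factors


def forbidden_factors(alphabet, permitted, k):
    """Well-formed k-factors (markers only at the edges) that never occur."""
    firsts = list(alphabet) + [LEFT]
    lasts = list(alphabet) + [RIGHT]
    candidates = {f + "".join(mid) + l
                  for f in firsts
                  for mid in product(alphabet, repeat=k - 2)
                  for l in lasts}
    return candidates - permitted


def in_approximation(word, forbidden, k):
    """True if no width-k window of the bounded word is forbidden."""
    aug = LEFT + word + RIGHT
    return all(aug[i:i + k] not in forbidden for i in range(len(aug) - k + 1))


def accepts(delta, start, accepting, word):
    """Run the (partial) DFA on word."""
    q = start
    for a in word:
        if (q, a) not in delta:
            return False
        q = delta[(q, a)]
    return q in accepting


# Toy language: exactly one "s" surrounded by any number of "u"s.
delta = {(0, "u"): 0, (0, "s"): 1, (1, "u"): 1}
edges = trim(augment(delta, 0, {1}))
forbidden = forbidden_factors("us", permitted_factors(edges, 2), 2)
print(sorted(forbidden))  # ['><', 'ss']
```

In this toy case the two forbidden factors rule out the empty word and adjacent `s` symbols, but the approximation still admits strings such as `uu` and `sus` that the DFA rejects. Those strings belong to the residue set in the sense of the abstract, which illustrates why the approximation overgenerates when the stringset is not SL.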

Source journal: Mathematics of Language

Self-citation rate: 0.00%

Latest articles in this journal:

(Re)introducing Regular Graph Languages
Extracting Forbidden Factors from Regular Stringsets
How Many Stemmata with Root Degree k?
On the Logical Complexity of Autosegmental Representations
A Proof-Theoretic Semantics for Transitive Verbs with an Implicit Object