Maximum Likelihood Estimation of Factored Regular Deterministic Stochastic Languages
Chihiro Shibata, Jeffrey Heinz
Proceedings of the 16th Meeting on the Mathematics of Language, 2019-07-01
https://doi.org/10.18653/v1/W19-5709
This paper proves that for every class C of stochastic languages defined by the co-emission product of finitely many probabilistic deterministic finite-state acceptors (PDFAs), and for every data sequence D of finitely many strings drawn i.i.d. from some stochastic language, the Maximum Likelihood Estimate of D with respect to C can be found efficiently by locally optimizing the parameter values. We show that a consequence of the co-emission product is that each PDFA behaves like an independent factor in a joint distribution, so the likelihood function decomposes in a natural way. We also show that the negative log-likelihood function is convex. These results are motivated by the study of Strictly k-Piecewise (SPk) Stochastic Languages, a class of stochastic languages that is both linguistically motivated and naturally understood in terms of the co-emission product of certain PDFAs.
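To make the co-emission idea concrete, the following is a minimal sketch and not the paper's own construction. It assumes that the co-emission product multiplies the per-factor emission weights for each symbol and renormalizes over the alphabet, and that each factor is parametrized by log-weights; both are assumptions made here for illustration. The two toy automata, the end marker `#`, and all function names are hypothetical. Under these assumptions the negative log-likelihood is a sum of log-sum-exp terms minus linear terms, and the last lines spot-check midpoint convexity numerically (a sanity check, not a proof).

```python
import math
import random

# Alphabet plus a hypothetical end-of-string marker "#".
SYMBOLS = ["a", "b", "#"]

# Two toy deterministic factors (hypothetical, not taken from the paper).
# Each maps (state, symbol) -> next state. Factor 0 remembers whether an
# "a" has been seen so far; factor 1 does the same for "b".
DELTA = [
    {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1},
    {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1},
]
N_STATES = [2, 2]


def coemission(theta, states):
    """Distribution over SYMBOLS given one current state per factor.

    theta[j][q][s] is the log-weight factor j assigns to symbol s in state q.
    Assumption: per-factor weights are multiplied, then renormalized.
    """
    scores = [sum(theta[j][q][s] for j, q in enumerate(states)) for s in SYMBOLS]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in scores))
    return {s: math.exp(x - log_z) for s, x in zip(SYMBOLS, scores)}


def neg_log_likelihood(theta, data):
    """Sum of -log P(w) over all strings w; every factor starts in state 0."""
    total = 0.0
    for w in data:
        states = [0] * len(DELTA)
        for s in list(w) + ["#"]:
            total -= math.log(coemission(theta, states)[s])
            if s != "#":
                # Each factor advances independently on the shared symbol.
                states = [DELTA[j][(q, s)] for j, q in enumerate(states)]
    return total


def random_theta(rng):
    """Random log-weights for every (factor, state, symbol) triple."""
    return [[{s: rng.uniform(-2, 2) for s in SYMBOLS} for _ in range(n)]
            for n in N_STATES]


def midpoint(t1, t2):
    """Coordinate-wise average of two parameter settings."""
    return [[{s: 0.5 * (a[s] + b[s]) for s in SYMBOLS} for a, b in zip(f1, f2)]
            for f1, f2 in zip(t1, t2)]


rng = random.Random(0)
data = ["ab", "ba", "aab", ""]
t1, t2 = random_theta(rng), random_theta(rng)
lhs = neg_log_likelihood(midpoint(t1, t2), data)
rhs = 0.5 * (neg_log_likelihood(t1, data) + neg_log_likelihood(t2, data))
print("midpoint convexity holds:", lhs <= rhs + 1e-9)
```

Because every factor reads the same symbol at each step, the joint state is just the tuple of per-factor states, and the parameter count grows with the sum of the factors' sizes rather than with the size of the product automaton.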