The impact of state merging on predictive accuracy in probabilistic tree automata: Dietze's conjecture revisited

Journal of Computer and System Sciences (IF 1.1; Zone 3, Computer Science; Q1 BUSINESS, FINANCE) Pub Date: 2024-06-27 DOI: 10.1016/j.jcss.2024.103563
Journal of Computer and System Sciences, volume 146, Article 103563. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0022000024000588

Abstract

Dietze's conjecture concerns the problem of equipping a tree automaton M with weights to make it probabilistic, in such a way that the resulting automaton N predicts a given corpus C as accurately as possible. The conjecture states that the accuracy cannot increase if the states in M are merged with respect to an equivalence relation ∼ so that the result is a smaller automaton M^∼. Put differently, merging states can never improve predictions. This is under the assumption that both M and M^∼ are bottom-up deterministic and accept every tree in C. We prove that the conjecture holds, using a construction that turns any probabilistic version N^∼ of M^∼ into a probabilistic version N of M, such that N assigns at least as great a weight to each tree in C as N^∼ does.
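
To make the setting concrete, here is a minimal Python sketch, not taken from the paper, of how a bottom-up deterministic weighted tree automaton assigns a weight to a tree and how states are merged under an equivalence relation. The tree encoding, the function names (`run`, `tree_weight`, `merge_states`), the example automaton, its hand-picked weights, and the convention that the weights of transitions sharing a target state sum to 1 are all assumptions of this sketch. It does not implement the construction used in the proof.

```python
from math import prod

# Hypothetical encoding: a tree is a nested tuple (symbol, child_1, ..., child_k).
# A bottom-up deterministic automaton is a dict
#   (symbol, (child states...)) -> (target state, weight),
# so every left-hand side has at most one transition.

def run(delta, tree):
    """Return (state, weight) of the unique run on `tree`, or None if none exists."""
    symbol, *children = tree
    results = [run(delta, child) for child in children]
    if any(r is None for r in results):
        return None
    key = (symbol, tuple(state for state, _ in results))
    if key not in delta:
        return None
    target, weight = delta[key]
    # The weight of a run is the product of the weights of its transitions.
    return target, weight * prod(w for _, w in results)

def tree_weight(delta, final_states, tree):
    """Weight of `tree`; 0.0 if the automaton does not accept it."""
    result = run(delta, tree)
    if result is None or result[0] not in final_states:
        return 0.0
    return result[1]

def merge_states(delta, cls):
    """Unweighted quotient: rename every state q to its class cls[q].

    Raises ValueError if the merged automaton is not bottom-up deterministic.
    """
    merged = {}
    for (symbol, kids), (target, _) in delta.items():
        key = (symbol, tuple(cls[q] for q in kids))
        image = cls[target]
        if merged.setdefault(key, image) != image:
            raise ValueError(f"merging breaks determinism at {key}")
    return merged

# Toy automaton M with states a1, a2, q (illustrative only).
M = {
    ("c", ()): ("a1", 1.0),
    ("d", ()): ("a2", 1.0),
    ("f", ("a1",)): ("q", 0.25),
    ("f", ("a2",)): ("q", 0.75),
}
cls = {"a1": "a", "a2": "a", "q": "q"}  # merge a1 and a2 into one class a

structure = merge_states(M, cls)
# Hand-picked weights for the merged automaton (an assumption of this sketch).
quotient_weights = {("c", ()): 0.25, ("d", ()): 0.75, ("f", ("a",)): 1.0}
M_merged = {key: (target, quotient_weights[key]) for key, target in structure.items()}

corpus = [("f", ("c",)), ("f", ("d",))]
weights_M = [tree_weight(M, {"q"}, t) for t in corpus]
weights_merged = [tree_weight(M_merged, {"q"}, t) for t in corpus]
```

In this toy example both automata assign the same weight to each corpus tree, so the merged automaton does no better than the original, which is consistent with the statement of the conjecture. It illustrates the setting only and is not evidence for the general result.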

Source journal
Journal of Computer and System Sciences (Engineering & Technology: Computer Science, Theory & Methods)
CiteScore: 3.70
Self-citation rate: 0.00%
Articles published: 58
Review time: 68 days
About the journal: The Journal of Computer and System Sciences publishes original research papers in computer science and related subjects in system science, with attention to the relevant mathematical theory. Applications-oriented papers may also be accepted and they are expected to contain deep analytic evaluation of the proposed solutions. Research areas include traditional subjects such as:
• Theory of algorithms and computability
• Formal languages
• Automata theory
Contemporary subjects such as:
• Complexity theory
• Algorithmic Complexity
• Parallel & distributed computing
• Computer networks
• Neural networks
• Computational learning theory
• Database theory & practice
• Computer modeling of complex systems
• Security and Privacy