Structural learning of simple staged trees

Data Mining and Knowledge Discovery (IF 2.8; Zone 3, Computer Science; Q2, Computer Science, Artificial Intelligence). Pub Date: 2024-02-15. DOI: 10.1007/s10618-024-01007-0

Abstract

Bayesian networks faithfully represent the symmetric conditional independences existing between the components of a random vector. Staged trees are an extension of Bayesian networks for categorical random vectors whose graph represents non-symmetric conditional independences via vertex coloring. However, since they are based on a tree representation of the sample space, the underlying graph becomes cluttered and difficult to visualize as the number of variables increases. Here, we introduce the first structural learning algorithms for the class of simple staged trees, entertaining a compact coalescence of the underlying tree from which non-symmetric independences can be easily read. We show that data-learned simple staged trees often outperform Bayesian networks in model fit and illustrate how the coalesced graph is used to identify non-symmetric conditional independences.
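The idea of encoding non-symmetric conditional independences through vertex coloring can be illustrated with a minimal sketch. This is not the structural learning algorithm of the paper. It is a hand-built toy staged tree over three binary variables, and the names `stage_of`, `stage_probs`, `path_probability`, and `independent_in_context`, along with all probabilities, are hypothetical choices made for illustration only. Vertices that share a stage (color) share the same conditional distribution over the next variable.

```python
from itertools import product

# Hypothetical toy staged tree over binary variables X1, X2, X3.
# Each non-leaf vertex is identified by the path of values leading to it.
# stage_of assigns every vertex a stage (colour).
stage_of = {
    (): "s0",
    (0,): "s1", (1,): "s2",
    (0, 0): "s3", (0, 1): "s3",  # same colour: X3 ignores X2 when X1 = 0
    (1, 0): "s4", (1, 1): "s5",  # different colours: X3 depends on X2 when X1 = 1
}

# One conditional distribution per stage (illustrative numbers, not from the paper).
stage_probs = {
    "s0": [0.6, 0.4],
    "s1": [0.7, 0.3],
    "s2": [0.2, 0.8],
    "s3": [0.5, 0.5],
    "s4": [0.9, 0.1],
    "s5": [0.3, 0.7],
}


def path_probability(path):
    """Probability of a root-to-leaf path: product of the stage probabilities along it."""
    prob = 1.0
    for depth, value in enumerate(path):
        prob *= stage_probs[stage_of[path[:depth]]][value]
    return prob


def independent_in_context(x1):
    """True if X3 is independent of X2 given X1 = x1, read directly off the colouring."""
    return len({stage_of[(x1, x2)] for x2 in (0, 1)}) == 1


# Sum of all root-to-leaf path probabilities (should be 1 up to rounding).
total = sum(path_probability(p) for p in product((0, 1), repeat=3))
```

In this toy tree, `independent_in_context(0)` holds while `independent_in_context(1)` does not. An independence that holds in one context but not in the other is the kind of non-symmetric statement that a Bayesian network over the same three variables cannot express through its graph alone.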
