Measuring homoplasy I: comprehensive measures of maximum and minimum cost under parsimony across discrete cost matrix character types.
Authors: Jennifer F Hoyal Cuthill, Graeme T Lloyd
Journal: Cladistics (impact factor 3.9; JCR Q1, Evolutionary Biology)
Published: 2024-06-25
DOI: 10.1111/cla.12582 (https://doi.org/10.1111/cla.12582)
Citations: 0
Abstract
Here, we propose, prove mathematically and discuss maximum and minimum measures of maximum parsimony evolution across 12 discrete phylogenetic character types, classified across 4467 morphological and molecular datasets. Covered character types are: constant, binary symmetric, multistate unordered (non-additive) symmetric, multistate linear ordered symmetric, multistate non-linear ordered symmetric, binary irreversible, multistate irreversible, binary Dollo, multistate Dollo, multistate custom symmetric, binary custom asymmetric and multistate custom asymmetric characters. We summarize published solutions and provide and prove a range of new formulae for the algebraic calculation of minimum (m), maximum (g) and maximum possible (gmax) character cost for applicable character types. Algorithms for exhaustive calculation of m, g and gmax applicable to all classified character types (within computational limits on the numbers of taxa and states) are also provided. The general algorithmic solution for minimum steps (m) is identical to a minimum spanning tree on the state graph or minimum weight spanning arborescence on the state digraph. Algorithmic solutions for character g and gmax are based on matrix mathematics equivalent to optimization on the star tree, respectively for given state frequencies and all possible state frequencies meeting specified numbers of taxa and states. We show that maximizing possible cost (gmax) with given transition costs can be equivalent to maximizing, across all possible state frequency combinations, the lowest implied cost of state transitions if any one state is ancestral on the star tree, via the solution of systems of linear equations. The methods we present, implemented in the Claddis R package, extend to a comprehensive range, the fundamental character types for which homoplasy may be measured under parsimony using m, g and gmax, including extra cost (h), consistency index (ci), retention index (ri) or indices based thereon.
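The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the Claddis implementation, and every function name below is hypothetical. It assumes a square cost matrix `costs[a][j]` giving the cost of a change from state `a` to state `j`, treats any state in the matrix as a candidate ancestor on the star tree, and computes m only for symmetric matrices, using Prim's minimum spanning tree over the observed states. Asymmetric matrices would need a minimum weight spanning arborescence, which is not shown. The index formulas are the standard ones: h = s - m, ci = m / s and ri = (g - s) / (g - m), where s is the cost observed on a given tree.

```python
def star_tree_cost(costs, freqs):
    """g: cheapest star-tree cost over every candidate ancestral state.

    freqs[j] is the number of taxa showing state j.
    """
    k = len(costs)
    return min(
        sum(freqs[j] * costs[a][j] for j in range(k))
        for a in range(k)
    )


def mst_cost(costs, freqs):
    """m for a symmetric matrix: Prim's MST over the observed states."""
    observed = [s for s, n in enumerate(freqs) if n > 0]
    if len(observed) <= 1:
        return 0  # a constant character needs no changes
    in_tree = {observed[0]}
    total = 0
    while len(in_tree) < len(observed):
        # Cheapest edge joining the tree to a state not yet in it.
        cost, nxt = min(
            (costs[u][v], v)
            for u in in_tree
            for v in observed
            if v not in in_tree
        )
        total += cost
        in_tree.add(nxt)
    return total


def compositions(n, k):
    """Yield every way to split n taxa among k states, each state used."""
    if k == 1:
        yield (n,)
        return
    for first in range(1, n - k + 2):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def max_possible_cost(costs, n_taxa):
    """gmax: largest g over all state frequencies (needs n_taxa >= states)."""
    k = len(costs)
    return max(star_tree_cost(costs, f) for f in compositions(n_taxa, k))


def homoplasy_indices(s, m, g):
    """Extra cost h, consistency index ci and retention index ri."""
    h = s - m
    ci = m / s if s else None          # undefined when s == 0
    ri = (g - s) / (g - m) if g != m else None  # undefined when g == m
    return h, ci, ri


# Linear ordered three-state character: cost is the distance between states.
ordered = [[0, 1, 2],
           [1, 0, 1],
           [2, 1, 0]]
freqs = [2, 1, 3]                      # taxa per state
g = star_tree_cost(ordered, freqs)     # 5
m = mst_cost(ordered, freqs)           # 2
g_max = max_possible_cost(ordered, 6)  # 5
```

The exhaustive search in `max_possible_cost` grows quickly with the numbers of taxa and states, which matches the abstract's caveat about computational limits.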
About the journal:
Cladistics publishes high quality research papers on systematics, encouraging debate on all aspects of the field, from philosophy, theory and methodology to empirical studies and applications in biogeography, coevolution, conservation biology, ontogeny, genomics and paleontology.
Cladistics is read by scientists working in the research fields of evolution, systematics and integrative biology and enjoys a consistently high position in the ISI® rankings for evolutionary biology.