Philipp Mitteroecker, Michael L Collyer, Dean C Adams
Systematic Biology (JCR Q1, Evolutionary Biology; impact factor 6.1). Journal article, published 2024-07-06. DOI: 10.1093/sysbio/syae035 (https://doi.org/10.1093/sysbio/syae035)
Exploring Phylogenetic Signal in Multivariate Phenotypes by Maximizing Blomberg's K.
Due to the hierarchical structure of the tree of life, closely related species often resemble each other more than distantly related species, a pattern termed phylogenetic signal. Numerous univariate statistics have been proposed as measures of phylogenetic signal for single phenotypic traits, but the study of phylogenetic signal for multivariate data, which are common in modern biology, remains challenging. Here we introduce a new method to explore phylogenetic signal in multivariate phenotypes. Our approach decomposes the data into linear combinations with maximal (or minimal) phylogenetic signal, as measured by Blomberg's K. The loading vectors of these phylogenetic components, or K-components, can be biologically interpreted, and scatterplots of the scores can be used as a low-dimensional ordination of the data that maximally (or minimally) preserves phylogenetic signal. We present algebraic and statistical properties, along with two new summary statistics, KA and KG, of phylogenetic signal in multivariate data. Simulation studies showed that KA and KG have higher statistical power than the previously suggested statistic Kmult, especially if phylogenetic signal is low or concentrated in a few trait dimensions. In two empirical applications to vertebrate cranial shape (crocodyliforms and papionins), we found statistically significant phylogenetic signal concentrated in a few trait dimensions. The finding that phylogenetic signal can be highly variable across the dimensions of multivariate phenotypes has important implications for current maximum likelihood approaches to phylogenetic signal in multivariate data.
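The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the standard univariate definition of Blomberg's K (an observed ratio of mean squared errors divided by its expectation under Brownian motion) and treats the search for linear combinations with extreme K as a generalized eigenproblem of two quadratic forms. The function names (`phylo_center`, `blomberg_k`, `k_components`, `balanced_tree_cov`), the toy 8-tip tree, and the simulated traits are all hypothetical, and the summary statistics KA and KG are not computed here because the abstract does not define them.

```python
import numpy as np

def phylo_center(X, C_inv):
    """Subtract the phylogenetic (GLS) mean from each column of X."""
    ones = np.ones((X.shape[0], 1))
    root = (ones.T @ C_inv @ X) / (ones.T @ C_inv @ ones)  # 1 x p root estimate
    return X - ones @ root

def expected_ratio(C, C_inv):
    """Expected MSE0/MSE under Brownian motion (denominator of Blomberg's K)."""
    n = C.shape[0]
    return (np.trace(C) - n / C_inv.sum()) / (n - 1)

def blomberg_k(x, C):
    """Blomberg's K for a single trait x, given phylogenetic covariance C."""
    C_inv = np.linalg.inv(C)
    e = phylo_center(np.asarray(x, dtype=float).reshape(-1, 1), C_inv).ravel()
    observed = (e @ e) / (e @ C_inv @ e)  # MSE0 / MSE; the (n - 1) factors cancel
    return observed / expected_ratio(C, C_inv)

def k_components(X, C):
    """Linear combinations X @ w ordered from maximal to minimal K.

    Solves A w = lambda B w with A = E'E and B = E' C^-1 E, where E is the
    phylogenetically centered data. Assumes fewer traits than species and
    linearly independent traits, so that B is positive definite.
    """
    C_inv = np.linalg.inv(C)
    E = phylo_center(X, C_inv)
    A = E.T @ E
    B = E.T @ C_inv @ E
    L = np.linalg.cholesky(B)            # B = L L^T
    L_inv = np.linalg.inv(L)
    evals, V = np.linalg.eigh(L_inv @ A @ L_inv.T)
    order = np.argsort(evals)[::-1]      # largest ratio first
    W = L_inv.T @ V[:, order]            # loading vectors as columns
    return evals[order] / expected_ratio(C, C_inv), W

def balanced_tree_cov(depth=3):
    """Covariance of a balanced tree with unit branch lengths (toy example)."""
    n = 2 ** depth
    return np.array([[depth - (i ^ j).bit_length() for j in range(n)]
                     for i in range(n)], dtype=float)

# Toy data: two Brownian-motion traits plus one trait of pure noise.
rng = np.random.default_rng(0)
C = balanced_tree_cov(3)
bm_traits = np.linalg.cholesky(C) @ rng.standard_normal((8, 2))
noise_trait = rng.standard_normal((8, 1))
X = np.hstack([bm_traits, noise_trait])

k_values, loadings = k_components(X, C)
scores = X @ loadings                    # columns are the component scores
print("K of each component:", k_values)
print("K of each original trait:", [blomberg_k(X[:, j], C) for j in range(3)])
```

Because every original trait is itself a linear combination of the traits, the first component's K is at least as large as the K of any single trait, and the last component's K is at most as small. Plotting the first two columns of `scores` gives the kind of low-dimensional ordination the abstract describes.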
About the journal:
Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.