Gil Yardeni, Michael H J Barfuss, Walter Till, Matthew R Thornton, Clara Groot Crego, Christian Lexer, Thibault Leroy, Ovidiu Paun
The recent rapid radiation of Tillandsia subgenus Tillandsia (Bromeliaceae) provides an attractive system to study the drivers and constraints of species diversification. This species-rich Neotropical monocot clade includes predominantly epiphytic species displaying vast phenotypic diversity. Recent in-depth phylogenomic work revealed that the subgenus originated within the last 7 myr, with one major expansion from South into Central America within the last 5 myr. However, disagreements between phylogenies and lack of resolution at shallow nodes suggest that hybridization may have occurred throughout the radiation, together with frequent incomplete lineage sorting and rapid gene family evolution. We used whole-genome resequencing data to explore the evolutionary history of representative ingroup species employing both tree-based and network approaches. Our results indicate that lineage co-occurrence does not predict relatedness and confirm significant deviations from a tree-like structure, coupled with pervasive gene-tree discordance. Focusing on hybridization, ABBA-BABA and related statistics were used to infer the rates and relative timing of introgression, whereas topology weighting uncovered high heterogeneity of the phylogenetic signal along the genome. High rates of hybridization within and among subclades suggest that, contrary to previous hypotheses, the expansion of subgenus Tillandsia into Central America proceeded through several dispersal events, punctuated by episodes of diversification and gene flow. Network analysis revealed reticulation as a plausible propeller during radiation and establishment across different ecological niches. This work contributes a plant example of prevalent hybridization during rapid species diversification, supporting the hypothesis that interspecific gene flow facilitates explosive diversification.
{"title":"The Explosive Radiation of the Neotropical Tillandsia Subgenus Tillandsia (Bromeliaceae) Has Been Accompanied by Pervasive Hybridization.","authors":"Gil Yardeni, Michael H J Barfuss, Walter Till, Matthew R Thornton, Clara Groot Crego, Christian Lexer, Thibault Leroy, Ovidiu Paun","doi":"10.1093/sysbio/syaf039","DOIUrl":"10.1093/sysbio/syaf039","url":null,"abstract":"<p><p>The recent rapid radiation of Tillandsia subgenus Tillandsia (Bromeliaceae) provides an attractive system to study the drivers and constraints of species diversification. This species-rich Neotropical monocot clade includes predominantly epiphytic species displaying vast phenotypic diversity. Recent in-depth phylogenomic work revealed that the subgenus originated within the last 7 myr, with one major expansion from South into Central America within the last 5 myr. However, disagreements between phylogenies and lack of resolution at shallow nodes suggest that hybridization may have occurred throughout the radiation, together with frequent incomplete lineage sorting and rapid gene family evolution. We used whole-genome resequencing data to explore the evolutionary history of representative ingroup species employing both tree-based and network approaches. Our results indicate that lineage co-occurrence does not predict relatedness and confirm significant deviations from a tree-like structure, coupled with pervasive gene-tree discordance. Focusing on hybridization, ABBA-BABA and related statistics were used to infer the rates and relative timing of introgression, whereas topology weighting uncovered high heterogeneity of the phylogenetic signal along the genome. High rates of hybridization within and among subclades suggest that, contrary to previous hypotheses, the expansion of subgenus Tillandsia into Central America proceeded through several dispersal events, punctuated by episodes of diversification and gene flow. 
Network analysis revealed reticulation as a plausible propeller during radiation and establishment across different ecological niches. This work contributes a plant example of prevalent hybridization during rapid species diversification, supporting the hypothesis that interspecific gene flow facilitates explosive diversification.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"22-38"},"PeriodicalIF":5.7,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12805668/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144498008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to: The Fossilized Birth-Death Model Is Identifiable.","authors":"","doi":"10.1093/sysbio/syaf074","DOIUrl":"10.1093/sysbio/syaf074","url":null,"abstract":"","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"193"},"PeriodicalIF":5.7,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12805664/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145401960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mathieu Fourment, Matthew Macaulay, Christiaan J Swanepoel, Xiang Ji, Marc A Suchard, Frederick A Matsen IV
Bayesian inference has predominantly relied on the Markov chain Monte Carlo (MCMC) algorithm for many years. However, MCMC is computationally laborious, especially for complex phylogenetic models of time trees. This bottleneck has led to the search for alternatives, such as variational Bayes, which can scale better to large data sets. In this paper, we introduce torchtree, a framework written in Python that allows developers to easily implement rich phylogenetic models and algorithms using a fixed tree topology. One can either use automatic differentiation or leverage torchtree's plug-in system to compute gradients analytically for model components for which automatic differentiation is slow. We demonstrate that the torchtree variational inference framework performs similarly to BEAST in terms of speed, and delivers promising approximation results, though accuracy varies across scenarios. Furthermore, we explore the use of the forward Kullback-Leibler (KL) divergence as an optimizing criterion for variational inference, which can handle discontinuous and nondifferentiable models. Our experiments show that inference using the forward KL divergence is frequently faster per iteration compared with the evidence lower bound (ELBO) criterion, although the ELBO-based inference may converge faster in some cases. Overall, torchtree provides a flexible and efficient framework for phylogenetic model development and inference using PyTorch.
{"title":"torchtree: Flexible Phylogenetic Model Development and Inference Using PyTorch.","authors":"Mathieu Fourment, Matthew Macaulay, Christiaan J Swanepoel, Xiang Ji, Marc A Suchard, Frederick A Matsen Iv","doi":"10.1093/sysbio/syaf047","DOIUrl":"10.1093/sysbio/syaf047","url":null,"abstract":"<p><p>Bayesian inference has predominantly relied on the Markov chain Monte Carlo (MCMC) algorithm for many years. However, MCMC is computationally laborious, especially for complex phylogenetic models of time trees. This bottleneck has led to the search for alternatives, such as variational Bayes, which can scale better to large data sets. In this paper, we introduce torchtree, a framework written in Python that allows developers to easily implement rich phylogenetic models and algorithms using a fixed tree topology. One can either use automatic differentiation or leverage torchtree's plug-in system to compute gradients analytically for model components for which automatic differentiation is slow. We demonstrate that the torchtree variational inference framework performs similarly to BEAST in terms of speed, and delivers promising approximation results, though accuracy varies across scenarios. Furthermore, we explore the use of the forward Kullback-Leibler (KL) divergence as an optimizing criterion for variational inference, which can handle discontinuous and nondifferentiable models. Our experiments show that inference using the forward KL divergence is frequently faster per iteration compared with the evidence lower bound (ELBO) criterion, although the ELBO-based inference may converge faster in some cases. 
Overall, torchtree provides a flexible and efficient framework for phylogenetic model development and inference using PyTorch.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"39-51"},"PeriodicalIF":5.7,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12805669/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144561212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A somewhat personal account of the development and acceptance of numerical taxonomic methods during the early years of the journal Systematic Zoology. Includes a few perspectives on the changes in taxonomy and the journal after 75 years.
{"title":"Too Many Numbers?","authors":"F James Rohlf","doi":"10.1093/sysbio/syaf076","DOIUrl":"10.1093/sysbio/syaf076","url":null,"abstract":"<p><p>A somewhat personal account of the development and acceptance of numerical taxonomic methods during the early years of the journal Systematic Zoology. Includes a few perspectives on the changes in taxonomy and the journal after 75 years.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"14-21"},"PeriodicalIF":5.7,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145309144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unrooted phylogenetic networks are commonly used to represent evolutionary data in the presence of incompatibilities. Although rooted phylogenetic networks offer a more explicit framework for depicting evolutionary histories involving reticulate events, they are reported less frequently, probably due to a lack of tools that are as easily applicable as those for unrooted networks. Here, we introduce PhyloFusion, a fast and user-friendly method for constructing rooted phylogenetic networks from sets of rooted phylogenetic trees. The resulting networks have the tree-child property. The algorithm accommodates trees with unresolved nodes (often resulting from the contraction of low-support edges) as well as some degree of missing taxa. We demonstrate its application to the analysis of functionally related gene groups and show that it can efficiently handle data sets comprising tens of trees or hundreds of taxa. An open source implementation of PhyloFusion is available as part of the SplitsTree app: https://www.github.com/husonlab/splitstree6. All data are available here: https://doi.org/10.5061/dryad.k3j9kd5h5.
{"title":"PhyloFusion-Fast and Easy Fusion of Rooted Phylogenetic Trees into Rooted Phylogenetic Networks.","authors":"Louxin Zhang, Banu Cetinkaya, Daniel H Huson","doi":"10.1093/sysbio/syaf049","DOIUrl":"10.1093/sysbio/syaf049","url":null,"abstract":"<p><p>Unrooted phylogenetic networks are commonly used to represent evolutionary data in the presence of incompatibilities. Although rooted phylogenetic networks offer a more explicit framework for depicting evolutionary histories involving reticulate events, they are reported less frequently, probably due to a lack of tools that are as easily applicable as those for unrooted networks. Here, we introduce PhyloFusion, a fast and user-friendly method for constructing rooted phylogenetic networks from sets of rooted phylogenetic trees. The resulting networks have the tree-child property. The algorithm accommodates trees with unresolved nodes-often resulting from the contraction of low-support edges-as well as some degree of missing taxa. We demonstrate its application to the analysis of functionally related gene groups and show that it can efficiently handle data sets comprising tens of trees or hundreds of taxa. An open source implementation of PhyloFusion is available as part of the SplitsTree app: https://www.github.com/husonlab/splitstree6. 
All data available here: https://doi.org/10.5061/dryad.k3j9kd5h5.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"88-99"},"PeriodicalIF":5.7,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12805670/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144650538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nearly all modern studies that address evolutionary questions require consideration of the phylogenetic relationships among species. But what, exactly, is a species tree? And how do we go about estimating such a tree from genomic data? In this Evolving View, I consider the historical development of the field of species tree inference and discuss both progress and controversies within the field at present. I conclude by suggesting future directions and highlighting challenges the field is likely to face in the coming years.
{"title":"An evolving view of species tree inference.","authors":"Laura Kubatko","doi":"10.1093/sysbio/syag002","DOIUrl":"https://doi.org/10.1093/sysbio/syag002","url":null,"abstract":"Nearly all modern studies that address evolutionary questions require consideration of the phylogenetic relationships among species. But what, exactly, is a species tree? And how do we go about estimating such a tree from genomic data? In this Evolving View, I consider the historical development of the field of species tree inference and discuss both progress and controversies within the field at present. I conclude by suggesting future directions and highlighting challenges the field is likely to face in the coming years.","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":"20 1","pages":""},"PeriodicalIF":6.5,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145971919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ana Serra Silva, Karen Siu-Ting, Christopher J Creevey, Davide Pisani, Mark Wilkinson
Missing data is a long-standing issue in phylogenetic inference, which often results in high levels of taxonomic instability, obscuring otherwise well-supported relationships. Multiple approaches have been developed to deal with the negative effects of ineffective overlap on tree resolution, often by identifying taxa for removal. Here, we repurpose a heuristic method developed to identify unstable taxa in morphological data matrices, concatabominations, and combine it with a novel gene-tree jackknifing on matrix representation of trees to identify candidates for targeted sequencing. Using a multilocus caecilian data set, we illustrate the method's capacity to identify candidate taxa and loci for additional sequencing, compare the results with those of the mathematics-based gene sampling sufficiency approach, and explore the terrace space associated with the multilocus data set. We show that our approach yields tractable numbers of loci/taxa for targeted sequencing that successfully mitigate topological instability due to ineffective overlap, even when modest amounts of data are added.
{"title":"Coping with Ineffective Overlap in Multilocus Phylogenetics.","authors":"Ana Serra Silva, Karen Siu-Ting, Christopher J Creevey, Davide Pisani, Mark Wilkinson","doi":"10.1093/sysbio/syaf044","DOIUrl":"10.1093/sysbio/syaf044","url":null,"abstract":"<p><p>Missing data is a long-standing issue in phylogenetic inference, which often results in high levels of taxonomic instability, obscuring otherwise well-supported relationships. Multiple approaches have been developed to deal with the negative effects of ineffective overlap on tree resolution, often by identifying taxa for removal. Here, we repurpose a heuristic method developed to identify unstable taxa in morphological data matrices, concatabominations, and combine it with a novel gene-tree jackknifing on matrix representation of trees to identify candidates for targeted sequencing. Using a multilocus caecilian data set, we illustrate the method's capacity to identify candidate taxa and loci for additional sequencing, compare the results with those of the mathematics-based gene sampling sufficiency approach, and explore the terrace space associated with the multilocus data set. 
We show that our approach yields tractable numbers of loci/taxa for targeted sequencing that successfully mitigate topological instability due to ineffective overlap, even when modest amounts of data are added.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":"52-71"},"PeriodicalIF":5.7,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12805666/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144561211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kin Onn Chan, Dario N Neokleous, Shahrul Anuar, Rafe M Brown, Carl R Hutter, Indraneil Das, Stefan T Hertwig
The evolutionary dynamics of cryptic species remain poorly understood, and their detection relies primarily on methods that quantify divergence, assuming that gene flow is absent. Here, we examine how gene flow shapes the evolutionary trajectories and species boundaries in Bornean Fanged Frogs, a renowned example of cryptic diversity where a single species has been split into 18 genetically divergent yet morphologically indistinguishable species. We employed target-capture data from over 13,000 loci to assess lineage independence of 14 nominal species distributed across Malaysian Borneo by evaluating both divergence and cohesion using network multispecies coalescent (NMSC) and MSC + migration approaches. Under the Unified Species Concept, only six of the 14 nominal species unambiguously form independently evolving lineages; the remainder represent cohesive metapopulation lineages nested within those six species. While mitochondrial p-distances varied substantially (up to 10%), genome-wide net divergences (Da) were more consistent, ranging from 0.5% to 2%, placing all the hypothesized "cryptic species" within the empirical gray zone of the speciation continuum. We show that diversification in the gray zone is unpredictable and heavily impacted by gene flow, leading to two key phenomena that confound species delimitation: (1) the artifactual branch effect, where admixed lineages are inferred as long, early-diverging branches, creating an illusion of deep divergence; and (2) the species-definition anomaly zone, where intraspecific pairwise sequence distances exceed interspecific ones. We further demonstrate that divergence in the gray zone varies among metrics and genomic regions, reflecting heterogeneity in evolutionary dynamics across the genome. Different genomic markers also vary considerably in phylogenetic discordance and their ability to retain signatures of gene flow. 
Loci from anchored hybrid enrichment (AHE) and ultraconserved elements (UCE) produced less phylogenetic discordance and retained signals of older introgression but failed to detect recent migration, making them suitable for phylogenetic reconstruction and inferring ancient introgression, but not ongoing gene flow. Recognizing the central role of gene flow reframes our understanding of cryptic species; rather than being considered as genetically distinct units that failed to evolve morphological differentiation, they are manifestations of continuous diversification in the gray zone. This shift in perspective offers a new and dynamic evolutionary framework for identifying and interpreting cryptic biodiversity across the Tree of Life.
{"title":"A Genomic Perspective on Cryptic Species Reveals Complex Evolutionary Dynamics in the Gray Zone of the Speciation Continuum.","authors":"Kin Onn Chan,Dario N Neokleous,Shahrul Anuar,Rafe M Brown,Carl R Hutter,Indraneil Das,Stefan T Hertwig","doi":"10.1093/sysbio/syag001","DOIUrl":"https://doi.org/10.1093/sysbio/syag001","url":null,"abstract":"The evolutionary dynamics of cryptic species remain poorly understood, and their detection relies primarily on methods that quantify divergence, assuming that gene flow is absent. Here, we examine how gene flow shapes the evolutionary trajectories and species boundaries in Bornean Fanged Frogs, a renowned example of cryptic diversity where a single species has been split into 18 genetically divergent yet morphologically indistinguishable species. We employed target-capture data from over 13,000 loci to assess lineage independence of 14 nominal species distributed across Malaysian Borneo by evaluating both divergence and cohesion using network multispecies coalescent (NMSC) and MSC + migration approaches. Under the Unified Species Concept, only six of the 14 nominal species unambiguously form independently evolving lineages; the remainder represent cohesive metapopulation lineages nested within those six species. While mitochondrial p-distances varied substantially (up to 10%), genome-wide net divergences (Da) were more consistent, ranging from 0.5-2%, placing all the hypothesized \"cryptic species\" within the empirical gray zone of the speciation continuum. We show that diversification in the gray zone is unpredictable and heavily impacted by gene flow, leading to two key phenomena that confound species delimitation: (1) the artifactual branch effect, where admixed lineages are inferred as long, early-diverging branches, creating an illusion of deep divergence; and (2) the species-definition anomaly zone, where intraspecific pairwise sequence distances exceed interspecific ones. 
We further demonstrate that divergence in the gray zone varies among metrics and genomic regions, reflecting heterogeneity in evolutionary dynamics across the genome. Different genomic markers also vary considerably in phylogenetic discordance and their ability to retain signatures of gene flow. Loci from anchored hybrid enrichment (AHE) and ultraconserved elements (UCE) produced less phylogenetic discordance and retained signals of older introgression but failed to detect recent migration, making them suitable for phylogenetic reconstruction and inferring ancient introgression, but not ongoing gene flow. Recognizing the central role of gene flow reframes our understanding of cryptic species; rather than being considered as genetically distinct units that failed to evolve morphological differentiation, they are manifestations of continuous diversification in the gray zone. This shift in perspective offers a new and dynamic evolutionary framework for identifying and interpreting cryptic biodiversity across the Tree of Life.","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":"15 1","pages":""},"PeriodicalIF":6.5,"publicationDate":"2026-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145971938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mélina A Celik, Carmelo Fruciano, Kaylene Butler, Vera Weisbecker, Matthew J Phillips
Reconstructing phylogeny from morphological data remains mired in investigator biases, including subjective inclusion and discretisation of phenotypic variation. Geometric morphometrics and multivariate statistical analyses provide an alternative array of tools for studying variation in morphological traits. However, direct analysis of landmark data is often unreliable for phylogeny reconstruction. Morphological variation is typically highly correlated among nearby landmarks and may evolve saltationally between adaptive peaks instead of gradually, thereby violating the assumptions of typical continuous models. To address these concerns, we developed an approach to more objectively discretise morphometric data and applied it to 3D surface scans of mandibles and postcranial elements of Macropodiformes (kangaroos, bettongs and rat-kangaroos). The scanned elements were partitioned into sets of locally co-varying landmarks which approximate functional units. These subregions were discretised into "atomised" characters using novel approaches to combine the objectivity of continuous shape variation for delineating discrete states with the model flexibility offered for multistate and binary characters. This allows us to (1) potentially reduce the influence of non-independence among neighbouring landmarks, (2) accommodate multimodal variation from saltational evolution, (3) accommodate missing data, such as from fragmentary fossils, and (4) promote tree-search efficiency. We built discrete morphological character matrices using three alternative approaches: commonly used clustering algorithms (UPGMA, k-means, k-medoids, Gaussian mixture modelling), a minimum evolution branch length criterion, and a tree sampling procedure. 
Our phylogenetic analyses with these novel matrices generally succeeded in recovering genera and several deep-level macropodiform clades, but failed to accurately reconstruct intergeneric relationships within the rapid diversification of the macropodine sub-family; those relationships were also not recovered with continuous morphological data or traditionally discretised characters and are the most poorly resolved with DNA data. On balance, our atomised characters, which derive from only mandibular and three postcranial elements, show promise for improving objectivity, accuracy and clocklikeness in morphological phylogenetics and provide pathways for accommodating correlated homoplasy and for more accurately estimating rates of morphological evolution, and thereby better integrating phenotypic and genomic data for phylogenetic inference.
{"title":"Phylogenetic Inference from Atomised 3D Morphometric Data: a Case Study using Kangaroos.","authors":"Mélina A Celik, Carmelo Fruciano, Kaylene Butler, Vera Weisbecker, Matthew J Phillips","doi":"10.1093/sysbio/syaf091","DOIUrl":"https://doi.org/10.1093/sysbio/syaf091","url":null,"abstract":"<p><p>Reconstructing phylogeny from morphological data remains mired in investigator biases, including subjective inclusion and discretisation of phenotypic variation. Geometric morphometrics and multivariate statistical analyses provide an alternative array of tools for studying variation in morphological traits. However, direct analysis of landmark data is often unreliable for phylogeny reconstruction. Morphological variation is typically highly correlated among nearby landmarks and may evolve saltationally between adaptive peaks instead of gradually, thereby violating the assumptions of typical continuous models. To address these concerns, we developed an approach to more objectively discretise morphometric data and applied it to 3D surface scans of mandibles and postcranial elements of Macropodiformes (kangaroos, bettongs and rat-kangaroos). The scanned elements were partitioned into sets of locally co-varying landmarks which approximate functional units. These subregions were discretised into \"atomised\" characters using novel approaches to combine the objectivity of continuous shape variation for delineating discrete states with the model flexibility offered for multistate and binary characters. This allows us to (1) potentially reduce the influence of non-independence among neighbouring landmarks, (2) accommodate multimodal variation from saltational evolution, (3) accommodate missing data, such as from fragmentary fossils, and (4) promote tree-search efficiency. 
We built discrete morphological character matrices using three alternative approaches: commonly used clustering algorithms (UPGMA, k-means, k-medoids, Gaussian mixture modelling), a minimum evolution branch length criterion, and a tree sampling procedure. Our phylogenetic analyses with these novel matrices generally succeeded in recovering genera and several deep-level macropodiform clades, but failed to accurately reconstruct intergeneric relationships within the rapid diversification of the macropodine sub-family; those relationships were also not recovered with continuous morphological data or traditionally discretised characters and are the most poorly resolved with DNA data. On balance, our atomised characters, which derive from only mandibular and three postcranial elements, show promise for improving objectivity, accuracy and clocklikeness in morphological phylogenetics and provide pathways for accommodating correlated homoplasy and for more accurately estimating rates of morphological evolution, and thereby better integrating phenotypic and genomic data for phylogenetic inference.</p>","PeriodicalId":22120,"journal":{"name":"Systematic Biology","volume":" ","pages":""},"PeriodicalIF":5.7,"publicationDate":"2025-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145865621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Statistically defensible species diagnoses are essential for producing robust taxonomies, which underpin much of biological research. Yet, most species descriptions remain largely descriptive, often lack rigorous statistical validation, and suffer from confounding factors stemming from sampling bias, geographic variation, and intraspecific diversity. These limitations are further compounded by the steep learning curve of advanced statistical and data visualization tools, which poses a major barrier for early-career scientists and researchers in under-resourced regions who may lack programming experience or access to expert support. To address these challenges, we developed GroupStruct2, a powerful yet accessible R-based Shiny application that democratizes robust statistical analysis and data visualization for species diagnosis across any taxonomic group. GroupStruct2 is implemented through a user-friendly graphical user interface (GUI) that requires no coding experience. The interface guides users through a comprehensive workflow, from raw data upload and outlier detection to assumption testing, adaptive statistical analyses, and the generation of highly customizable, publication-ready visualizations based on the ggplot2 architecture. It supports allometric body-size correction and widely used dimension-reduction techniques, including Principal Component Analysis (PCA), Discriminant Analysis of Principal Components (DAPC), and, critically, Multiple Factor Analysis (MFA), which enables the joint analysis of meristic, morphometric, and categorical trait data within a single integrative taxonomic framework. We showcase GroupStruct2’s capabilities using two empirical datasets, demonstrating how to conduct robust statistical analyses and produce publication-quality visualizations in just a few clicks. 
By lowering technical barriers without compromising analytical rigor, GroupStruct2 enables researchers of all backgrounds, working on any taxonomic group, to produce statistically sound and visually compelling species diagnoses, thereby advancing both the accessibility and the scientific rigor of taxonomy, species delimitation, and biodiversity research.
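The abstract names allometric body-size correction and PCA but gives no formulas. As a rough sketch only, and not GroupStruct2's own code (which is R-based), the Python below applies one commonly used allometric adjustment, log10(trait) - b * (log10(size) - mean of log10(size)), where b is the least-squares slope of log trait on log size, and then runs a PCA by singular value decomposition. The function names, the simulated measurements, and the choice of this particular adjustment are all assumptions made for illustration.

```python
import numpy as np


def allometric_adjust(trait, size):
    """Return log10(trait) with the linear effect of log10(size) removed."""
    log_trait = np.log10(np.asarray(trait, dtype=float))
    log_size = np.log10(np.asarray(size, dtype=float))
    centered_size = log_size - log_size.mean()
    # Least-squares slope of log-trait on log-size (assumption: one pooled slope).
    slope = np.sum(centered_size * (log_trait - log_trait.mean())) / np.sum(centered_size ** 2)
    return log_trait - slope * centered_size


def pca_scores(data, n_components=2):
    """PCA via SVD of the column-centered matrix; returns scores and variance ratios."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return u[:, :n_components] * s[:n_components], explained[:n_components]


# Simulated (made-up) measurements in mm for 30 specimens; not data from the paper.
rng = np.random.default_rng(0)
body_size = rng.uniform(40.0, 80.0, size=30)
head_length = 0.25 * body_size ** 0.9 * np.exp(rng.normal(0.0, 0.03, size=30))
limb_length = 0.40 * body_size ** 1.1 * np.exp(rng.normal(0.0, 0.03, size=30))

# Size-correct each trait, then ordinate the adjusted traits.
adjusted = np.column_stack([
    allometric_adjust(head_length, body_size),
    allometric_adjust(limb_length, body_size),
])
scores, explained = pca_scores(adjusted)
print(scores.shape, explained.round(3))
```

After adjustment each trait is uncorrelated with log body size by construction, so the PCA reflects shape variation rather than size. Whether the slope should be pooled or estimated per group depends on the dataset; the sketch pools all specimens for brevity.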
Kin Onn Chan, L Lee Grismer. GroupStruct2: A User-Friendly Graphical User Interface for Statistical and Visual Support in Species Diagnosis. Systematic Biology, published 2025-12-19. DOI: 10.1093/sysbio/syaf090 (https://doi.org/10.1093/sysbio/syaf090).