Explaining global species richness patterns is a major goal of evolution, ecology, and biogeography. These richness patterns are often attributed to spatial variation in diversification rates (speciation minus extinction). Surprisingly, prominent studies of birds, fish, and plants have reported higher speciation and/or diversification rates at higher latitudes, where species richness is lower. We hypothesize that these surprising findings are explained by the focus of those studies on relatively recent macroevolutionary rates, within the last ~20 million years. Here, we analyze global richness patterns among 10,213 squamates (lizards and snakes) and explore their underlying causes. We find that when diversification rates are quantified at more recent timescales, patterns of rates and richness are mismatched, similar to previous studies in other taxa. Importantly, diversification rates estimated over longer timescales were instead positively related to geographic richness patterns. These observations may help resolve the paradoxical results of previous studies in other taxa. We found that diversification rates were largely unrelated to climate, even though climate and richness were related. Instead, higher tropical richness was related to the ancient occupation of tropical regions, with colonization time being the variable that explained the most variation in richness overall. We suggest that large-scale diversity patterns might be best understood by considering climate, deep-time diversification rates, and the time spent in different regions, rather than recent diversification rates alone.
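As a rough illustration of why time spent in a region can matter as much as the rate itself, the sketch below uses the standard expectation for a constant-rate birth-death process, E[N(t)] = exp((speciation - extinction) * t), starting from one lineage. The function names, the two regions, and all rate and time values are hypothetical and chosen only for illustration; they are not estimates from the squamate analysis, and the study's own rate estimators are not reproduced here.

```python
import math


def net_diversification(speciation: float, extinction: float) -> float:
    """Net diversification rate r = speciation - extinction (per lineage per Myr)."""
    return speciation - extinction


def expected_richness(rate: float, time_myr: float) -> float:
    """Expected number of lineages descended from one colonist after time_myr,
    under a constant-rate birth-death process: E[N(t)] = exp(r * t)."""
    return math.exp(rate * time_myr)


# Hypothetical regions; every number below is made up for illustration.
# "Tropical": lower net rate, but colonized long ago.
tropical = expected_richness(net_diversification(0.10, 0.05), 100.0)
# "Temperate": twice the net rate, but colonized recently.
temperate = expected_richness(net_diversification(0.20, 0.10), 30.0)

print(f"tropical: {tropical:.1f} expected lineages")
print(f"temperate: {temperate:.1f} expected lineages")
```

Under these assumed values, the region with the lower rate still accumulates more expected lineages because it has been occupied for longer, which is the kind of rate-richness mismatch the abstract describes.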
Nuclear genome sequencing for phylogenetics is resource-intensive, whereas mitochondrial genomes can be sequenced and analyzed with relative ease for building densely sampled phylogenetic trees of the most species-rich lineages of animals. Here, we develop a conceptual approach and bioinformatics workflow for combining nuclear single-copy orthologs with less informative but densely sampled mitochondrial genomes, for a detailed tree of Coleoptera (beetles). Basal relationships of Coleoptera were first inferred from > 2,000 BUSCO loci mined from GenBank's Short Read Archive for 119 exemplars of all major lineages under various substitution models and levels of matrix completion, to reveal universally supported nodes. Second, the corresponding mitogenomes were extracted and combined with an additional 373 species selected for broad taxonomic and biogeographic coverage, roughly in proportion to the known global species diversity of Coleoptera. Bioinformatic processing of mitogenomes was conducted with a novel pipeline for rapid, accurate annotation of protein-coding genes. Finally, phylogenetic trees from all 491 mitogenomes were generated under a backbone constraint from the universal basal nodes, which produced a well-supported tree of the major lineages at the family and superfamily level. Being genetically unlinked and showing unique character variation, mitogenomes provide a unique perspective on the phylogeny. Comparison with 3 recent nuclear phylogenomic studies resulted in the recognition of > 80 nodes universally present across all analyses. These may now support the higher classification of Coleoptera and serve as a backbone for further studies, as numerous full mitogenomes and mitochondrial DNA barcodes are added to an increasingly complete phylogenetic tree of this super-diverse insect order.
Obtaining a timescale for bacterial evolution is crucial for understanding early life evolution but is difficult owing to the scarcity of bacterial fossils. Here, we introduce multiple new time constraints to calibrate bacterial evolution based on ancient symbiosis. This idea is implemented using a bacterial tree constructed with genes found in the mitochondrial lineages phylogenetically embedded within Proteobacteria. The expanded mitochondria-bacterial tree allows the node age constraints of eukaryotes established by their abundant fossils to be propagated to ancient co-evolving bacterial symbionts and across the bacterial tree of life. Importantly, we formulate a new probabilistic framework that considers uncertainty in inference of the ancestral lifestyle of modern symbionts to apply 19 relative time constraints, each informed by a host-symbiont association, that constrain bacterial symbionts to be no older than their eukaryotic hosts. Moreover, we develop an approach to incorporating substitution mixture models that better accommodate substitutional saturation and compositional heterogeneity for dating deep phylogenies. Our analysis estimates that the last bacterial common ancestor occurred approximately 4.0-3.5 billion years ago (Ga), followed by rapid divergence of major bacterial clades. This estimate is generally robust to alternative root ages, root positions, tree topologies, fossil ages, ancestral lifestyle reconstructions, and gene sets, among other factors. The obtained timetree serves as a foundation for testing hypotheses regarding bacterial diversification and its correlation with geobiological events across different timescales.
The two most popular tree models used in phylogenetics are the birth-death process (BD) and the Kingman coalescent (KC). These two models differ in several respects, notably: (i) the curve of the population size through time is a stochastic process in the BD, versus a parametrized curve in the KC, and (ii) the BD makes assumptions about the way samples are collected, while the KC conditions on the number of samples and the collection times, thus bypassing the need to describe the sampling procedure. These two models have been applied in different contexts: the BD in macroevolutionary studies of clades of species, and the KC for populations. The exception is the field of phylogenetic epidemiology, which uses both models. This raises the question of how such different models can be used in the same context. In this paper, we study large-population limits of the BD, in a search for a mathematical link between the BD and the KC. We show that the KC is the large-population limit of a BD conditioned on a given population trajectory, and we provide the formula for the parameter θ of the limiting KC. This formula appears in earlier studies, but the present article is the first to show formally how the correspondence arises as a large-population limit, and that the BD needs to be conditioned for the KC to arise. Besides these fundamentally mathematical results, we demonstrate how our findings can be used practically in phylogenetic inference. In particular, we propose a new method for phylogenetic epidemiology, called CalicoBird, ensuing from our results. We conjecture that this new method, used in conjunction with auxiliary data (e.g. prevalence or incidence data), should allow estimating important epidemiological parameters (e.g. the prevalence and the effective reproduction number), in a way that is robust to the data-generating model and the sampling procedure. Future studies will be needed to put our claims to the test.
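To make the role of θ in the KC concrete, here is a minimal sketch of the textbook Kingman coalescent for n samples collected at the same time. It assumes a constant θ, interpreted as the expected coalescence time of a single pair of lineages, whereas the abstract allows θ to follow a parametrized curve through time. The function names are hypothetical, and the paper's formula linking θ to the BD parameters is not reproduced here. With k lineages, the waiting time to the next coalescence is exponential with rate C(k, 2) / θ.

```python
import random
from math import comb


def expected_tmrca(n: int, theta: float) -> float:
    """Expected time to the most recent common ancestor of n samples.

    Sums the mean waiting times theta / C(k, 2) for k = 2..n,
    which simplifies to 2 * theta * (1 - 1/n).
    """
    return sum(theta / comb(k, 2) for k in range(2, n + 1))


def simulate_tmrca(n: int, theta: float, rng: random.Random) -> float:
    """Draw one TMRCA under the constant-theta Kingman coalescent."""
    total = 0.0
    for k in range(n, 1, -1):
        # While k lineages remain, each of the C(k, 2) pairs
        # coalesces at rate 1 / theta.
        total += rng.expovariate(comb(k, 2) / theta)
    return total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    draws = [simulate_tmrca(10, 1.0, rng) for _ in range(5000)]
    print("simulated mean TMRCA:", sum(draws) / len(draws))
    print("analytical mean TMRCA:", expected_tmrca(10, 1.0))
```

The simulated mean should fall close to the analytical value. Note that the sampling procedure never enters the calculation: the process is conditioned on n, which is the contrast with the BD drawn in point (ii) above.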

