Genetic data contain a record of our evolutionary history. The availability of large-scale datasets of human populations from various geographic areas and timescales, coupled with advances in the computational methods to analyze these data, has transformed our ability to use genetic data to learn about our evolutionary past. Here, we review some of the widely used statistical methods to explore and characterize population relationships and history using genomic data. We describe the intuition behind commonly used approaches, their interpretation, and important limitations. For illustration, we apply some of these techniques to genome-wide autosomal data from 929 individuals representing 53 worldwide populations that are part of the Human Genome Diversity Project. Finally, we discuss the new frontiers in genomic methods to learn about population history. In sum, this review highlights the power (and limitations) of DNA to infer features of human evolutionary history, complementing the knowledge gleaned from other disciplines, such as archaeology, anthropology, and linguistics.
The Human Cell Atlas (HCA) is striving to build an open community that is inclusive of all researchers adhering to its principles and as open as possible with respect to data access and use. However, open data sharing can pose certain challenges. For instance, being a global initiative, the HCA must contend with a patchwork of local and regional privacy rules. A notable example is the implementation of the European Union General Data Protection Regulation (GDPR), which caused some concern in the biomedical and genomic data-sharing community. We examine how the HCA's large, international group of researchers is investing tremendous effort in ensuring appropriate sharing of data. We describe the HCA's objectives and governance, how it defines open data sharing, and ethico-legal challenges encountered early in its development; in particular, we describe the challenges prompted by the GDPR. Finally, we broaden the discussion to tools and strategies that can support ethical data governance.
DECIPHER (Database of Genomic Variation and Phenotype in Humans Using Ensembl Resources) shares candidate diagnostic variants and phenotypic data from patients with genetic disorders to facilitate research and improve the diagnosis, management, and therapy of rare diseases. The platform sits at the boundary between genomic research and the clinical community. DECIPHER aims to ensure that the most up-to-date data are made rapidly available within its interpretation interfaces to improve clinical care. Newly integrated cardiac case-control data that provide evidence of gene-disease associations and inform variant interpretation exemplify this mission. New research resources are presented in a format optimized for use by a broad range of professionals supporting the delivery of genomic medicine. The interfaces within DECIPHER integrate and contextualize variant and phenotypic data, helping to determine a robust clinico-molecular diagnosis for rare-disease patients, which combines both variant classification and clinical fit. DECIPHER supports discovery research, connecting individuals within the rare-disease community to pursue hypothesis-driven research.
Recent advancements in single-cell technologies have enabled expression quantitative trait locus (eQTL) analysis across many individuals at single-cell resolution. Compared with bulk RNA sequencing, which averages gene expression across cell types and cell states, single-cell assays capture the transcriptional states of individual cells, including fine-grained, transient, and difficult-to-isolate populations at unprecedented scale and resolution. Single-cell eQTL (sc-eQTL) mapping can identify context-dependent eQTLs that vary with cell states, including some that colocalize with disease variants identified in genome-wide association studies. By uncovering the precise contexts in which these eQTLs act, single-cell approaches can unveil previously hidden regulatory effects and pinpoint important cell states underlying molecular mechanisms of disease. Here, we present an overview of recently deployed experimental designs in sc-eQTL studies. In the process, we consider the influence of study design choices such as cohort, cell states, and ex vivo perturbations. We then discuss current methodologies, modeling approaches, and technical challenges as well as future opportunities and applications.
In meiosis, homologous chromosome synapsis is mediated by a supramolecular protein structure, the synaptonemal complex (SC), that assembles between homologous chromosome axes. The mammalian SC comprises at least eight largely coiled-coil proteins that interact and self-assemble to generate a long, zipper-like structure that holds homologous chromosomes in close proximity and promotes the formation of genetic crossovers and accurate meiotic chromosome segregation. In recent years, numerous mutations in human SC genes have been associated with different types of male and female infertility. Here, we integrate structural information on the human SC with mouse and human genetics to describe the molecular mechanisms by which SC mutations can result in human infertility. We outline certain themes in which different SC proteins are susceptible to different types of disease mutation and how genetic variants with seemingly minor effects on SC proteins may act as dominant-negative mutations in which the heterozygous state is pathogenic.
The p-arms of the five human acrocentric chromosomes bear nucleolar organizer regions (NORs) comprising ribosomal gene (rDNA) repeats that are organized in a homogeneous tandem array and transcribed in a telomere-to-centromere direction. Precursor ribosomal RNA transcripts are processed and assembled into ribosomal subunits, the nucleolus being the physical manifestation of this process. I review current understanding of nucleolar chromosome biology and describe ongoing exploration into a role for the NOR chromosomal context. Full DNA sequences for acrocentric p-arms are now emerging, aided by the current revolution in long-read sequencing and genome assembly. Acrocentric p-arms vary from 10.1 to 16.7 Mb, accounting for ∼2.2% of the genome. The distal and proximal junctions that border the rDNA arrays are shared among the p-arms, with distal junctions showing evidence of functionality. The remaining p-arm sequences comprise multiple satellite DNA classes and segmental duplications that facilitate recombination between heterologous chromosomes, which is likely also involved in Robertsonian translocations.

