Biomolecular condensates are thought to create subcellular microenvironments that regulate specific biochemical activities. Extensive in vitro work has helped link condensate formation to a wide range of cellular processes, including gene expression, nuclear transport, signalling and stress responses. However, testing the relationship between condensate formation and function in cells is more challenging. In particular, the extent to which the cellular functions of condensates depend on the nature of the molecular interactions through which the condensates form is a major outstanding question. Here, we review results from recent genetic complementation experiments in cells, and highlight how genetic complementation provides important insights into cellular functions and functional specificity of biomolecular condensates. Combined with observations from human genetic disease, these experiments suggest that diverse condensate-promoting regions within cellular proteins confer different condensate compositions, biophysical properties and functions.
Decades of genetic association testing in human cohorts have provided important insights into the genetic architecture and biological underpinnings of complex traits and diseases. However, for certain traits, genome-wide association studies (GWAS) for common SNPs are approaching signal saturation, which underscores the need to explore other types of genetic variation to understand the genetic basis of traits and diseases. Copy number variation (CNV) is an important source of heritability that is well known to functionally affect human traits. Recent technological and computational advances enable the large-scale, genome-wide evaluation of CNVs, with implications for downstream applications such as polygenic risk scoring and drug target identification. Here, we review the current state of CNV-GWAS, discuss current limitations in resource infrastructure that need to be overcome to enable the wider uptake of CNV-GWAS results, highlight emerging opportunities and suggest guidelines and standards for future large-scale GWAS of genetic variation beyond SNPs.
Genomic data from millions of individuals have been generated worldwide to drive discovery and clinical impact in precision medicine. Lowering the barriers to using these data collectively is needed to equitably realize the benefits of the diversity and scale of population data. We examine the current landscape of global genomic data sharing, including the evolution of data sharing models from data aggregation through to data visiting and, for certain use cases, cross-cohort analysis using federated approaches across multiple environments. We highlight emerging examples of best practice relating to participant, patient and community engagement; evolution of technical standards, tools and infrastructure; and impact of research and health-care policy. We outline 12 actions we can all take together to scale up efforts to enable safe global data sharing and to move from projects that demonstrate feasibility to routine cross-analysis of research and clinical data sets, optimizing benefit.
Since the discovery of RNA splicing and its role in gene expression, researchers have sought a set of rules, an algorithm or a computational model that could predict the splice isoforms, and their frequencies, produced from any transcribed gene in a specific cellular context. Over the past 30 years, these models have evolved from simple position weight matrices to deep-learning models capable of integrating sequence data across vast genomic distances. Most recently, new model architectures are moving the field closer to context-specific alternative splicing predictions, and advances in sequencing technologies are expanding the type of data that can be used to inform and interpret such models. Together, these developments are driving improved understanding of splicing regulatory mechanisms and emerging applications of the splicing code to the rational design of RNA- and splicing-based therapeutics.