We present a novel method for analyzing the folding of intrinsically disordered proteins (IDPs), such as Tau and phosphorylated Tau (pTau), in solution. Using cross-linking mass spectrometry (XL-MS) combined with a new downstream analysis framework, we construct weighted interaction networks from cross-link-derived residue pairs without relying on predefined secondary structure assumptions. Structural differences between protein conformations are quantified by comparing the organization of loop structures within their cross-link networks. Validation with bovine serum albumin (BSA) in native and denatured states shows that at least 500 cross-links, requiring 5-10 replicate measurements, are needed for reliable detection of structural divergence. Leave-one-out analysis confirms that structural transitions are global, highlighting the importance of comprehensive cross-link datasets. The coverage of unique cross-links was evaluated using accumulation curves from randomized permutations. Saturation levels were 9.7% for myoglobin (of 528 possible cross-links, after 30 technical replicates), and 5.0% and 6.2% for native and denatured BSA (of 10,731 possible cross-links, after 84 and 62 technical replicates, respectively). For Tau and pTau, coverage reached 10.8% and 5.5% of the upper limit (8,256). Finally, applying our structural analysis to Tau and pTau during arachidonic acid-induced aggregation revealed distinct patterns of structural evolution between the two proteins.
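The accumulation-curve step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each technical replicate is available as a set of residue pairs, and the function name `accumulation_curve`, the toy replicates, and the upper limit of 10 possible cross-links are all hypothetical. The curve reports the mean number of unique cross-links seen after k replicates, averaged over random orderings of the replicates.

```python
import random


def accumulation_curve(replicates, n_permutations=200, seed=0):
    """Mean count of unique cross-links after 1..N replicates.

    `replicates` is a list of iterables of (residue_i, residue_j) pairs.
    The replicate order is shuffled `n_permutations` times and the
    cumulative number of unique pairs is averaged at each position.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(replicates)
    for _ in range(n_permutations):
        order = list(replicates)
        rng.shuffle(order)
        seen = set()
        for k, replicate in enumerate(order):
            seen.update(replicate)
            totals[k] += len(seen)
    return [total / n_permutations for total in totals]


# Hypothetical toy data: three replicates, each a set of residue pairs.
replicates = [
    {(1, 5), (2, 9)},
    {(1, 5), (3, 7)},
    {(2, 9), (4, 8)},
]
curve = accumulation_curve(replicates)

# Coverage = unique cross-links observed / possible cross-links.
possible_links = 10  # assumed upper limit for this toy example only
coverage = curve[-1] / possible_links
print(curve, coverage)
```

A curve that flattens as replicates are added suggests that further replicates contribute few new cross-links; the final value divided by the number of possible cross-links gives a coverage fraction of the kind reported above.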
There is increasing interest in artificially selecting or breeding microbial communities, but experiments have reported modest success. Here, we develop computational models to simulate two previously known selection methods and compare them to a new "disassembly" method. We evaluate all three methods in their ability to find a community that could efficiently degrade toxins, whereby investment into degradation results in slower growth. Our disassembly method relies on repeatedly competing different communities of known species combinations against one another, while regularly shuffling around their species combinations. This approach allows many species combinations to be explored, thereby maintaining enough between-community diversity for selection to act on, and resulting in communities with high performance. Nevertheless, selection at the community level in our simulations did not counteract selection at the individual level, nor the communities' ecological dynamics. Species in our model evolved to invest less into community function and more into growth, but increased growth compensated for reduced investment, such that overall community performance was barely affected by within-species evolution. Within-community ecological dynamics were more of a challenge, as we could control them during the selection process, but community composition and function dropped in the longer term. Our work shows that the strength of disassembly lies mainly in its ability to explore different species combinations, and helps to propose alternative designs for community selection experiments.
Cells frequently employ extracellular vesicles, or exosomes, to signal across long distances and coordinate collective actions. Exosomes diffuse slowly, can be actively degraded, and contain stochastic amounts of molecular cargo. These features raise the question of the efficacy of exosomes as a directional signal, but this question has not been systematically investigated. We develop a theoretical and computational approach to quantify the limits of exosome-mediated chemotaxis at the individual cell level. In our model, a leader cell secretes exosomes, which diffuse in the extracellular space, and a follower cell guides its migration by integrating discrete exosome detections over a finite memory window. We combine analytical calculations and stochastic simulations and show that the chemotactic velocity exhibits a non-monotonic dependence on the exosome cargo size. Small exosomes produce frequent but weak signals, whereas large exosomes produce strong but infrequent encounters. In the presence of nonlinear signal transduction, this tradeoff leads to an optimal cargo size that maximizes information throughput, as quantified by the average speed of the follower cell. Using a reduced one-dimensional model, we derive closed-form expressions coupling the optimal cargo size to follower speed as a function of secretion rate, memory time, and detection sensitivity. These results identify molecular packaging and memory integration as key determinants of exosome-mediated information transmission and highlight general design principles for optimization of migration under guidance by discrete and diffusible signaling particles.
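The cargo-size tradeoff described in the abstract above can be made concrete with a toy calculation. This is not the paper's model: it assumes a fixed cargo budget per unit time (so the encounter rate scales as budget / cargo size) and a Hill-type response per exosome, and every name and parameter value here (`mean_signal`, `budget`, `half_sat`, `hill`) is a hypothetical choice for illustration.

```python
def mean_signal(cargo, budget=100.0, half_sat=10.0, hill=2.0):
    """Toy mean signal per unit time for exosomes carrying `cargo` molecules.

    Assumptions (not from the source): a fixed molecular budget is split
    into exosomes of equal size, and each detected exosome triggers a
    Hill-type response that saturates around `half_sat`.
    """
    encounter_rate = budget / cargo  # small cargo -> frequent encounters
    response = cargo**hill / (half_sat**hill + cargo**hill)  # large cargo -> strong signal
    return encounter_rate * response


def best_cargo(candidates, **params):
    """Return the candidate cargo size with the largest toy mean signal."""
    return max(candidates, key=lambda cargo: mean_signal(cargo, **params))


sizes = range(1, 101)
nonlinear_optimum = best_cargo(sizes)            # hill = 2: interior optimum
linear_optimum = best_cargo(sizes, hill=1.0)     # hill = 1: smallest cargo wins
print(nonlinear_optimum, linear_optimum)
```

In this toy setting the signal is proportional to cargo^(hill-1) / (half_sat^hill + cargo^hill), which peaks at cargo = half_sat * (hill - 1)^(1/hill) when hill > 1 and decreases monotonically when hill = 1. That is consistent with the abstract's statement that an optimum appears in the presence of nonlinear signal transduction, although the paper's own expressions also involve memory time and detection sensitivity, which this sketch omits.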
Spatially resolved transcriptomics (SRT) enables the simultaneous capture of gene expression profiles and spatial localization, providing valuable insights into tissue architecture. However, the preservation of spatial information requires additional experimental procedures, which often introduce substantial technical noise. Existing methods typically perform denoising and spatial domain identification in separate steps, leading to suboptimal performance and limiting their applicability. To address this limitation, we propose an integrative network model, stACN (spatial transcriptomics Attribute Cell Network), that jointly denoises gene expression data and identifies spatial domains in SRT. Specifically, stACN first learns clean dual cell networks using a graph noise model, and then derives compatible cell features through joint tensor decomposition of the denoised networks. Experimental results demonstrate that stACN effectively enhances data quality, as measured by clustering agreement with reference annotations (Adjusted Rand Index, ARI), and facilitates spatial domain analysis in SRT datasets.
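The Adjusted Rand Index named in the abstract above measures agreement between a predicted clustering and reference annotations, corrected for chance. The following is a minimal sketch of the standard pair-counting formula, not code from stACN; the function name `adjusted_rand_index` and the toy label lists are illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treat as trivially identical

    # Pairs of items placed together in both labelings, in the reference
    # labeling, and in the predicted labeling.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs
    maximum = 0.5 * (together_true + together_pred)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (together_both - expected) / (maximum - expected)


# Toy example: the same partition under different label names.
reference = ["layer1", "layer1", "layer2", "layer2"]
predicted = [1, 1, 0, 0]
print(adjusted_rand_index(reference, predicted))
```

The index is 1 for identical partitions regardless of label names, is close to 0 for random assignments, and can be negative when agreement is worse than chance.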
There are various well-validated taxonomic classifiers for profiling shotgun metagenomics data, with two popular methods, MetaPhlAn (marker-gene-based) and Kraken (k-mer-based), at the forefront of many studies. Despite differences between classification approaches and calls for the development of consensus methods, most analyses of shotgun metagenomics data for microbiome studies use a single taxonomic classifier. In this study, we compare inferences from two broadly used classifiers, MetaPhlAn4 and Kraken2, applied to stool metagenomic samples from participants in the Integrative Longevity Omics study to measure associations of taxonomic diversity and relative abundance with age, replicating analyses in an independent cohort. We also introduce consensus and meta-analytic approaches to compare and integrate results from multiple classifiers. While many results are consistent across the two classifiers, we find classifier-specific inferences that would be lost when using one classifier alone. Both classifiers captured similar age-associated changes in diversity across cohorts, with variability in species alpha diversity driven by differences between classifiers. When using a correlated meta-analysis approach (AdjMaxP) across classifiers, differential abundance analysis captures more age-associated taxa, including 17 taxa robustly age-associated across cohorts. This study emphasizes the value of employing multiple classifiers and recommends novel approaches that facilitate the integration of results from multiple methodologies.
Genome annotations provide the essential framework for genomic analyses, capturing our current knowledge of gene structure and function as inferred from computational predictions and experimental evidence. Even as automated annotation pipelines become more sophisticated, their accuracy in representing unconventional gene expression events remains largely untested. Here, we address this gap by examining the most common form of translational recoding: the insertion of selenocysteine (Sec), a non-canonical amino acid incorporated into selenoproteins, oxidoreductase enzymes with essential roles in redox homeostasis. Sec insertion occurs in response to UGA, normally interpreted as a stop codon but recoded in selenoprotein mRNAs. Owing to the dual function of UGA, the identification of selenoprotein genes poses a challenge. We show that vertebrate selenoprotein genes are widely misannotated in major public databases. Only 11% and 5% of selenoprotein genes are well annotated in Ensembl and NCBI GenBank, respectively, due to the lack of dedicated selenoprotein annotation pipelines. In most cases (81% and 84%), overlapping flawed annotations are present that lack the Sec-encoding UGA. In contrast, NCBI RefSeq employs a dedicated selenoprotein pipeline, yet with some shortcomings: its selenoprotein annotations are correct in 77% of cases, and most errors affect families with a C-terminal Sec residue. We argue that selenoproteins must be correctly annotated in public databases and that this must occur via automated pipelines to keep pace with genome sequencing. To facilitate this task, we present a new version of Selenoprofiles, a homology-based tool for selenoprotein prediction that produces predictions with accuracy comparable to manual curation and can be easily deployed and integrated into existing annotation pipelines.
Understanding the heterogeneity of population-level viral fitness dynamics, which reflect the interplay between intrinsic viral properties and population immunity, is critical for pandemic preparedness. However, how these dynamics vary across diverse immune backgrounds and mutational landscapes remains poorly characterized. We present Geno-GNN, a graph representation learning approach for retrospectively characterizing the viral fitness dynamics of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Geno-GNN accurately predicts angiotensin-converting enzyme 2 (ACE2) binding affinity and immune escape potential across multiple external datasets. Using Geno-GNN, we identified temporal patterns in SARS-CoV-2 fitness and detected varying rates of fitness change associated with distinct immune backgrounds. Virtual mutation scanning revealed two fitness trajectories: broad immune evasion at the cost of ACE2 affinity, and maintenance of ACE2 affinity at or above the Wuhan-Hu-1 level along with moderate immune escape. Notably, real-world SARS-CoV-2 variants predominantly followed the latter trajectory, sustaining ACE2 affinity via fixed mutations. These findings underscore the heterogeneous, immune-contextualized nature of viral fitness dynamics and the complex evolutionary pathways of SARS-CoV-2.
The human spleen significantly influences red blood cell (RBC) dynamics due to its ability to retain and/or remove RBCs from peripheral blood circulation. This filtering can mediate a range of malaria disease manifestations, depending on the physiological properties of the spleen. Data collected from patients undergoing splenectomy in Papua, Indonesia, revealed that in asymptomatic infections the spleen harboured substantially more infected RBCs than were circulating in the peripheral blood and that the spleen is also congested with uninfected RBCs. We hypothesise that two conditions hold for the spleen to retain such a high proportion of infected and uninfected RBCs: (i) the retention rate of uninfected RBCs is significantly higher than in uninfected patients; and (ii) phagocytosing macrophages cannot clear all of the infected RBCs from the spleen. In this paper, we present a mathematical model of RBC dynamics that includes, for the first time, the spleen as a compartment capable of retaining large numbers of infected and uninfected RBCs in Plasmodium falciparum and P. vivax infections. By calibrating the model to the Papuan data, we demonstrate that the spleen plays a significant role in removing not only infected RBCs but also uninfected RBCs. Uninfected RBC retention in the spleen, attributable to malaria, is substantially higher than circulating RBC loss due to parasitisation, for infections by both Plasmodium species. In chronic infections, the ratio of circulating uninfected RBCs lost to splenic retention per circulating uninfected RBC lost to parasitisation is 17:1 for P. falciparum and 82:1 for P. vivax. These ratios are larger than previously published estimates for acute clinical infections.
Blood vessel pruning during angiogenesis is the process that optimizes the branching pattern to improve the transport properties of a vascular network. Recent studies show that a subset of endothelial cells (ECs) subjected to lower shear stress migrate toward vessels with higher shear stress, in opposition to the blood flow, leading to vessel regression. While dynamic changes in blood flow and local mechano-stress could coordinately modulate EC migration for vessel regression within the closed circulatory system, the effect of the complexity of haemodynamic forces and vessel properties on vessel pruning remains elusive. Here, we reconstructed a 3-dimensional (3D) vessel structure from 2D confocal images of the growing vessels in the mouse retina, and numerically obtained local information on blood flow, shear stress and blood pressure in the vasculature. Moreover, we developed a predictive model for vessel pruning based on machine learning. We found that the combination of shear stress and blood pressure with vessel radius was tightly correlated with vessel pruning sites. Our results highlight that the orchestrated contribution of local haemodynamic parameters is important for vessel pruning.

