DNA sequence alterations within DNA repeat domains inexplicably enhance the stability and delay the expansion of interrupted repeat domains. Here we propose mechanisms that rationalise such unanticipated outcomes. Specifically, we describe how interruption of a DNA repeat domain restricts the ensemble space available to dynamic, slip-out repeat bulge loops by introducing energetic barriers to loop migration. We explain how such barriers arise because some possible loop isomers result in energetically costly mismatches in the duplex portion of the repeat domain. We propose that the reduced ensemble space is the causative feature for the observed delay in repeat DNA expansion. We further posit that the observed loss of the interrupting repeat in some expanded DNAs reflects the transient occupation of loop isomer positions that result in a mismatch in the duplex stem due to 'leakiness' in the energy barrier. We propose that if the lifetime of such a low-probability event allows for recognition by the mismatch repair system, then 'repair' of the repeat interruption can occur, thereby rationalising the absence of the interruption in the final expanded DNA 'product.' Our proposed mechanistic pathways provide reasoned explanations for what have been described as 'puzzling' observations, while also yielding insights into a biomedically important set of coupled genotypic phenomena that map the linkage between DNA origami thermodynamics and phenotypic disease states.
Phase separation plays an important role in the formation of membraneless compartments within the cell, and intrinsically disordered proteins with low-complexity sequences can drive this compartmentalisation. Various intermolecular forces, such as aromatic-aromatic and cation-aromatic interactions, promote phase separation. However, little is known about how the ability of proteins to phase separate under physiological conditions is encoded in their energy landscapes, and this is the focus of the present investigation. Our results provide a first glimpse into how the energy landscapes of minimal peptides that contain π-π and cation-π interactions differ from those of peptides that lack amino acids with such interactions. The peaks in the heat capacity as a function of temperature report on alternative low-lying conformations that differ significantly in terms of their enthalpic and entropic contributions. The analysis and subsequent quantification of frustration of the energy landscape suggest that the interactions that promote phase separation lead to features (peaks or inflection points) at low temperatures in the heat capacity. More features may occur for peptides containing residues with better phase separation propensity, and the energy landscape is more frustrated for such peptides. Overall, this work links the features in the underlying single-molecule potential energy landscapes to their collective phase separation behaviour and identifies quantities (the heat capacity and a frustration metric) that can be utilised in soft material design.
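As a purely illustrative sketch (not the method used in the study), the link between low-lying conformations and heat capacity peaks can be seen in a toy model of discrete states, where the configurational heat capacity follows from the energy fluctuations, C = (⟨E²⟩ − ⟨E⟩²)/(k_B T²). The two energy levels, their degeneracies, the reduced units (k_B = 1) and the function name `heat_capacity` are all assumptions chosen for illustration.

```python
import numpy as np

def heat_capacity(energies, degeneracies, temperatures):
    """Configurational heat capacity of a discrete-state model (k_B = 1).

    Uses the fluctuation formula C = (<E^2> - <E>^2) / T^2 with Boltzmann
    weights g_i * exp(-E_i / T). All inputs are in reduced units.
    """
    energies = np.asarray(energies, dtype=float)
    degeneracies = np.asarray(degeneracies, dtype=float)
    temperatures = np.asarray(temperatures, dtype=float)

    # Shift energies so the lowest state has weight g_0 (numerical stability).
    shifted = energies - energies.min()
    weights = degeneracies * np.exp(-shifted[None, :] / temperatures[:, None])
    probs = weights / weights.sum(axis=1, keepdims=True)

    mean_e = (probs * energies).sum(axis=1)
    mean_e2 = (probs * energies**2).sum(axis=1)
    return (mean_e2 - mean_e**2) / temperatures**2

# Hypothetical two-state system: a compact ground state (E = 0, g = 1) and a
# higher-energy, higher-entropy alternative conformation (E = 1, g = 10).
temps = np.linspace(0.05, 2.0, 200)
capacity = heat_capacity([0.0, 1.0], [1.0, 10.0], temps)
peak_temperature = temps[np.argmax(capacity)]
print(f"heat capacity peaks near T = {peak_temperature:.2f} (reduced units)")
```

In this toy model the peak marks the temperature at which the higher-entropy state becomes competitive with the lower-enthalpy one; adding further low-lying states with different energies and degeneracies produces additional peaks or shoulders.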
The human chaperone DNAJB6b increases the solubility of proteins involved in protein aggregation diseases and suppresses the nucleation of amyloid structures. Owing to these favourable properties, DNAJB6b has gained increasing attention over the past decade. Understanding how DNAJB6b operates on a molecular level may aid the design of inhibitors against amyloid formation. In this work, fundamental aspects of DNAJB6b self-assembly have been examined, providing a basis for future experimental designs and conclusions. The results imply the formation of large chaperone clusters in a concentration-dependent manner. Microfluidic diffusional sizing (MDS) was used to evaluate how the average hydrodynamic radius of DNAJB6b varies with concentration. We found that, in 20 mM sodium phosphate buffer, 0.2 mM EDTA, at pH 8.0 and room temperature, DNAJB6b displays micellar behaviour, with a critical micelle concentration (CMC) of around 120 nM. The average hydrodynamic radius appears to be concentration independent between ∼10 μM and 100 μM, with a mean radius of about 12 nm. The CMC found by MDS is supported by native agarose gel electrophoresis, and the size distribution appears bimodal in the DNAJB6b concentration range ∼100 nM to 4 μM.
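For orientation, a hydrodynamic radius is commonly related to a measured diffusion coefficient through the Stokes-Einstein relation, R_h = k_B T / (6πηD). The sketch below applies that relation only; it is not the analysis pipeline of the study. The temperature (298.15 K) and the viscosity of water (0.89 mPa·s) are assumptions standing in for "room temperature" and the buffer viscosity, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def radius_from_diffusion(d, temperature=298.15, viscosity=8.9e-4):
    """Stokes-Einstein hydrodynamic radius (m) from a diffusion coefficient (m^2/s).

    temperature (K) and viscosity (Pa*s) default to assumed values for
    water at 25 degrees C, not to conditions reported in the abstract.
    """
    return K_B * temperature / (6.0 * math.pi * viscosity * d)

def diffusion_from_radius(radius, temperature=298.15, viscosity=8.9e-4):
    """Inverse relation: diffusion coefficient (m^2/s) from a radius (m)."""
    return K_B * temperature / (6.0 * math.pi * viscosity * radius)

# Diffusion coefficient expected for a 12 nm cluster under the assumed conditions,
# followed by a round trip back to the radius as a consistency check.
d_cluster = diffusion_from_radius(12e-9)
r_back = radius_from_diffusion(d_cluster)
print(f"D = {d_cluster:.3e} m^2/s, R_h = {r_back * 1e9:.1f} nm")
```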
The convergence of free-energy calculations based on importance sampling depends heavily on the choice of collective variables (CVs), which, in principle, should include the slow degrees of freedom of the biological processes to be investigated. Autoencoders (AEs), as emerging data-driven dimension reduction tools, have been utilised for discovering CVs. AEs, however, are often treated as black boxes, and what AEs actually encode during training, and whether the latent variables from encoders are suitable as CVs for further free-energy calculations, remain unknown. In this contribution, we review AEs and their time-series-based variants, including time-lagged AEs (TAEs) and modified TAEs, as well as a closely related model, the variational approach for Markov processes networks (VAMPnets). We then show through numerical examples that AEs learn the high-variance modes instead of the slow modes. In stark contrast, time-series-based models are able to capture the slow modes. Moreover, both modified TAEs with extensions from slow feature analysis and the state-free reversible VAMPnets (SRVs) can yield orthogonal multidimensional CVs. As an illustration, we employ SRVs to discover the CVs of the isomerizations of N-acetyl-N'-methylalanylamide and trialanine by iterative learning with trajectories from biased simulations. Last, through numerical experiments with anisotropic diffusion, we investigate the potential relationship between time-series-based models and committor probabilities.
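The distinction between high-variance and slow modes can be illustrated with linear stand-ins for the models discussed: principal component analysis (the optimum of a linear AE) and time-lagged independent component analysis (a linear time-lagged model). This is a minimal sketch on synthetic data, not one of the numerical examples from the work itself; the trajectory, the lag time and all function names are assumptions.

```python
import numpy as np

def make_trajectory(n_steps=20000, seed=0):
    """Synthetic 2D series: column 0 is slow but small, column 1 is fast but large."""
    rng = np.random.default_rng(seed)
    slow = np.zeros(n_steps)
    for t in range(1, n_steps):
        # Strongly autocorrelated AR(1) process with small amplitude.
        slow[t] = 0.99 * slow[t - 1] + 0.1 * rng.standard_normal()
    fast = 5.0 * rng.standard_normal(n_steps)  # uncorrelated, high variance
    return np.column_stack([slow, fast])

def pca_leading_direction(x):
    """Direction of maximum variance (what a linear autoencoder recovers)."""
    x = x - x.mean(axis=0)
    cov = x.T @ x / len(x)
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, -1]

def tica_leading_direction(x, lag=10):
    """Direction of maximum autocorrelation at the given lag (linear slow mode)."""
    x = x - x.mean(axis=0)
    a, b = x[:-lag], x[lag:]
    c0 = (a.T @ a + b.T @ b) / (2 * len(a))   # symmetrised instantaneous covariance
    ct = (a.T @ b + b.T @ a) / (2 * len(a))   # symmetrised time-lagged covariance
    vals0, vecs0 = np.linalg.eigh(c0)
    whiten = vecs0 / np.sqrt(vals0)           # W such that W.T @ c0 @ W = identity
    _, vecs = np.linalg.eigh(whiten.T @ ct @ whiten)
    v = whiten @ vecs[:, -1]                  # back to the original coordinates
    return v / np.linalg.norm(v)

traj = make_trajectory()
pca_dir = pca_leading_direction(traj)
tica_dir = tica_leading_direction(traj)
print("PCA direction :", np.round(np.abs(pca_dir), 3))
print("TICA direction:", np.round(np.abs(tica_dir), 3))
```

Under these assumptions the variance-based direction aligns with the fast, large-amplitude coordinate, whereas the time-lagged direction aligns with the slow, small-amplitude one, which is the behaviour a CV for free-energy calculations should capture.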
We previously presented a computational protocol to predict the enzymatic (enantio)selectivity of an ω-transaminase towards a set of ligands (Ramírez-Palacios et al. (2021) Journal of Chemical Information and Modeling 61(11), 5569-5580) by counting the number of binding poses present in molecular dynamics (MD) simulations that met a defined set of geometric criteria. The geometric criteria consisted of a hand-crafted set of distances, angles and dihedrals deemed to be important for the enzymatic reaction to take place. In this work, the MD trajectories are reanalysed using a deep-learning approach to predict the enantiopreference of the enzyme without the need for hand-crafted criteria. We show that a convolutional neural network is capable of classifying the trajectories as belonging to the 'reactive' or 'non-reactive' enantiomer (binary classification) with good accuracy (>0.90). The new method reduces the computational cost of the methodology because it does not require the sampling approach from the previous work. We also show that analysing how neural networks reach specific decisions can aid hand-crafted approaches (e.g. definition of near-attack conformations, or binding poses).
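The pose-counting idea behind the earlier protocol can be sketched as a per-frame filter on geometric descriptors. The example below is an assumption-laden illustration, not the published criteria: it uses a single distance and a single angle, the cut-offs (4.0 Å and 60-120°) are hypothetical, and the per-frame values are made up.

```python
import numpy as np

def count_reactive_poses(distance, angle, max_distance=4.0, angle_range=(60.0, 120.0)):
    """Count MD frames whose geometry satisfies all (hypothetical) criteria.

    distance: per-frame distance in angstrom between two atoms of interest.
    angle: per-frame angle in degrees.
    The thresholds are placeholders, not values from the cited protocol.
    """
    distance = np.asarray(distance, dtype=float)
    angle = np.asarray(angle, dtype=float)
    within_distance = distance <= max_distance
    within_angle = (angle >= angle_range[0]) & (angle <= angle_range[1])
    return int((within_distance & within_angle).sum())

# Made-up descriptors for four frames of one ligand enantiomer.
distances = [3.5, 4.5, 3.9, 5.0]
angles = [90.0, 95.0, 130.0, 100.0]
n_reactive = count_reactive_poses(distances, angles)
print(f"{n_reactive} of {len(distances)} frames meet the criteria")
```

Comparing such counts between the two enantiomers of a ligand gives a simple estimate of which one the enzyme prefers; the deep-learning approach described above replaces the hand-chosen thresholds with a learned classifier.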
Divalent sulfur (S) forms a chalcogen bond (Ch-bond) via its σ-holes and a hydrogen bond (H-bond) via its lone pairs. The relevance of these interactions and their interplay for protein structure and function is unclear. Based on analyses of the crystal structures of small organic/organometallic molecules and proteins and of their molecular electrostatic surface potential, we show that the reciprocity of the substituent-dependent strength of the σ-holes and lone pairs correlates with the formation of either a Ch-bond or an H-bond. In proteins, cystines preferentially form Ch-bonds, metal-chelated cysteines form H-bonds, while methionines form either of them with comparable frequencies. This has implications for the positioning of these residues and their role in protein structure and function. Computational analyses reveal that the S-mediated interactions stabilise protein secondary structures by mechanisms such as helix capping and protecting free β-sheet edges by negative design. The study highlights the importance of S-mediated Ch-bonds and H-bonds for understanding protein folding and function, for developing improved strategies for protein/peptide structure prediction and design, and for structure-based drug discovery.