Genome-wide identification of binding profiles for DNA-binding proteins from the limited number of intracellular pathogens in infection studies is crucial for understanding virulence and cellular processes but remains challenging, as the current ChIP-exo is designed for high-input bacterial cells (>10^10). Here, we developed an optimized ChIP-mini method, a low-input ChIP-exo utilizing a 5,000-fold reduced number of initial bacterial cells and an analysis pipeline, to identify genome-wide binding dynamics of DNA-binding proteins in host-infected pathogens. Applying ChIP-mini to intracellular Salmonella Typhimurium, we identified 642 and 1,837 binding sites of H-NS and RpoD, respectively, elucidating changes in their binding position and binding intensity during infection. Post-infection, we observed 21 significant reductions in H-NS binding at intergenic regions, exposing the promoter region of virulence genes, such as those in Salmonella pathogenicity islands-2, 3 and effectors. Furthermore, we revealed the crucial phenomenon that novel and significantly increased RpoD bindings were found within regions exhibiting diminished H-NS binding, thereby facilitating substantial upregulation of virulence genes. These findings markedly enhance our understanding of how H-NS and RpoD simultaneously coordinate the transcription initiation of virulence genes within macrophages. Collectively, this work demonstrates a broadly adaptable tool that will enable the elucidation of DNA-binding protein dynamics in diverse intracellular pathogens during infection.
RNA-binding proteins (RBPs) are central components of gene regulatory networks. The differentiation of heterocysts in filamentous cyanobacteria is an example of cell differentiation in prokaryotes. Although multiple non-coding transcripts are involved in this process, no RBPs have been implicated thus far. Here we used quantitative mass spectrometry to analyze the differential fractionation of RNA-protein complexes after RNase treatment in density gradients yielding 333 RNA-associated proteins, while a bioinformatic prediction yielded 311 RBP candidates in Nostoc sp. PCC 7120. We validated in vivo the RNA-binding capacity of six RBP candidates. Some participate in essential physiological aspects, such as photosynthesis (Alr2890), thylakoid biogenesis (Vipp1) or heterocyst differentiation (PrpA, PatU3), but their association with RNA was unknown. Validated RBPs Asl3888 and Alr1700 were not previously characterized. Alr1700 is an RBP with two oligonucleotide/oligosaccharide-binding (OB)-fold-like domains that is differentially expressed in heterocysts and interacts with non-coding regulatory RNAs. Deletion of alr1700 led to complete deregulation of the cell differentiation process, a striking increase in the number of heterocyst-like cells, and was ultimately lethal in the absence of combined nitrogen. These observations characterize this RBP as a master regulator of the heterocyst patterning and differentiation process, leading us to rename Alr1700 to PatR.
Increasing antifungal drug resistance is a major concern associated with human fungal pathogens like Aspergillus fumigatus. Genetic mutation and epimutation mechanisms clearly drive resistance, yet the epitranscriptome remains relatively untested. Here, deletion of the A. fumigatus transfer RNA (tRNA)-modifying isopentenyl transferase ortholog, Mod5, led to altered stress response and unexpected resistance against the antifungal drug 5-fluorocytosine (5-FC). After confirming the canonical isopentenylation activity of Mod5 by liquid chromatography-tandem mass spectrometry and Nano-tRNAseq, we performed simultaneous profiling of transcriptomes and proteomes to reveal a comparable overall response to 5-FC stress; however, a premature activation of cross-pathway control (CPC) genes in the knockout was further increased after antifungal treatment. We identified several orthologs of the Aspergillus nidulans Major Facilitator Superfamily transporter nmeA as specific CPC-client genes in A. fumigatus. Overexpression of the Mod5-target tRNA-Tyr(GΨA) in the Δmod5 strain rescued select phenotypes but failed to reverse 5-FC resistance, whereas deletion of nmeA largely, but incompletely, reverted the resistance phenotype, implying additional relevant exporters. In conclusion, 5-FC resistance in the absence of Mod5 and i6A likely originates from multifaceted transcriptional and translational changes that skew the fungus towards premature CPC-dependent activation of antifungal toxic-intermediate exporter nmeA, offering a potential mechanism reliant on RNA modification to facilitate transient antifungal resistance.
The DdmDE antiplasmid system, consisting of the helicase-nuclease DdmD and the prokaryotic Argonaute (pAgo) protein DdmE, plays a crucial role in defending Vibrio cholerae against plasmids. Guided by DNA, DdmE specifically targets plasmids, disassembles the DdmD dimer, and forms a DdmD-DdmE handover complex to facilitate plasmid degradation. However, the precise ATP-dependent DNA translocation mechanism of DdmD has remained unclear. Here, we present cryo-EM structures of DdmD bound to single-stranded DNA (ssDNA) in nucleotide-free, ATPγS-bound, and ADP-bound states. These structures, combined with biochemical analysis, reveal a unique "gate-clamp" mechanism for ssDNA translocation by DdmD. Upon ATP binding, arginine finger residues R855 and R858 reorient to interact with the γ-phosphate, triggering HD2 domain movement. This shift repositions the gate residue Q781, causing a flip of the 3' flank base, which is then clamped by residue F639. After ATP hydrolysis, the arginine finger releases the nucleotide, inducing HD2 to return to its open state. This conformational change enables DdmD to translocate along ssDNA by one nucleotide in the 5' to 3' direction. This study provides new insights into the ATP-dependent translocation of DdmD and contributes to understanding the mechanistic diversity within SF2 helicases.
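The translocation cycle described above can be read as a two-step loop: ATP binding closes HD2 and clamps the flipped base, then hydrolysis and nucleotide release reopen HD2 and advance the motor by one nucleotide. The following is a toy sketch of that loop only, assuming one ATP per nucleotide step. The class and function names (`ToyTranslocase`, `translocate`) are hypothetical, and the model carries no structural or kinetic detail from the study.

```python
from dataclasses import dataclass


@dataclass
class ToyTranslocase:
    """Toy two-state motor on ssDNA (illustrative only, not a model of real DdmD kinetics)."""

    position: int = 0           # nucleotide index at the gate, counted in the 5'->3' direction
    hd2_closed: bool = False    # True while ATP is bound and HD2 has moved
    base_clamped: bool = False  # True while the flipped 3' flank base is held

    def bind_atp(self) -> None:
        # ATP binding: arginine fingers contact the gamma-phosphate, HD2 moves,
        # the gate residue flips the 3' flank base and the clamp residue holds it.
        if self.hd2_closed:
            raise RuntimeError("ATP is already bound")
        self.hd2_closed = True
        self.base_clamped = True

    def hydrolyze_and_release(self) -> None:
        # Hydrolysis and nucleotide release: HD2 returns to the open state
        # and the motor ends up one nucleotide closer to the 3' end.
        if not self.hd2_closed:
            raise RuntimeError("no ATP is bound")
        self.hd2_closed = False
        self.base_clamped = False
        self.position += 1


def translocate(motor: ToyTranslocase, atp_molecules: int) -> int:
    """Run one bind/hydrolyze cycle per ATP and return the final position."""
    for _ in range(atp_molecules):
        motor.bind_atp()
        motor.hydrolyze_and_release()
    return motor.position
```

In this sketch the position can only advance after a completed bind-and-hydrolyze cycle, which mirrors the strict ordering of the steps in the abstract; calling `hydrolyze_and_release` on an open motor raises an error rather than moving it.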
Machine learning (ML) has shown great potential in the adaptive immune receptor repertoire (AIRR) field. However, there is a lack of large-scale ground-truth experimental AIRR data suitable for AIRR-ML-based disease diagnostics and therapeutics discovery. Simulated ground-truth AIRR data are required to complement the development and benchmarking of robust and interpretable AIRR-ML methods where experimental data are currently inaccessible or insufficient. The challenge for simulated data to be useful is incorporating key features observed in experimental repertoires. These features, such as antigen or disease-associated immune information, make AIRR-ML problems challenging. Here, we introduce LIgO, a software suite that simulates AIRR data for the development and benchmarking of AIRR-ML methods. LIgO incorporates different types of immune information on both the receptor and the repertoire level and preserves a native-like generation probability distribution. Additionally, LIgO assists users in determining the computational feasibility of their simulations. We show two examples where LIgO supports the development and validation of AIRR-ML methods: (i) how individuals carrying out-of-distribution immune information impact receptor-level prediction performance and (ii) how immune information co-occurring in the same AIRs impacts the performance of conventional receptor-level encoding and repertoire-level classification approaches. LIgO guides the advancement and assessment of interpretable AIRR-ML methods.
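To make the idea of simulated ground truth concrete, the sketch below implants a short motif into a fixed fraction of randomly generated receptor sequences and records which ones carry it. This is not LIgO's interface or algorithm: the function name `simulate_receptors`, the uniform amino-acid background, and the example motif are all assumptions, and unlike LIgO the toy makes no attempt to preserve a native-like generation probability distribution.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def simulate_receptors(n, motif, signal_fraction, length=14, seed=0):
    """Return a shuffled list of (sequence, has_signal) pairs.

    Background sequences are drawn uniformly at random (an assumption made
    for brevity); the motif overwrites a random window in signal sequences.
    """
    rng = random.Random(seed)
    n_signal = round(n * signal_fraction)
    receptors = []
    for i in range(n):
        seq = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        has_signal = i < n_signal
        if has_signal:
            # Pick a start position that keeps the whole motif inside the sequence.
            start = rng.randrange(length - len(motif) + 1)
            seq = seq[:start] + motif + seq[start + len(motif):]
        receptors.append((seq, has_signal))
    rng.shuffle(receptors)
    return receptors


if __name__ == "__main__":
    for seq, label in simulate_receptors(5, "CAS", 0.4):
        print(seq, label)
```

Because the labels record what was implanted, a classifier trained on such data can be scored against a known answer. Note that a background sequence may contain the motif by chance, so the label marks implantation rather than motif presence.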
The RNA chaperone Hfq plays crucial roles in bacterial gene expression and is a major facilitator of small regulatory RNA (sRNA) action. The toroidal architecture of the Hfq hexamer presents three well-characterized surfaces that allow it to bind sRNAs to stabilize them and engage target transcripts. Hfq-interacting sRNAs are categorized into two classes based on the surfaces they use to bind Hfq. By characterizing a systematic alanine mutant library of Hfq to identify amino acid residues that impact survival of Escherichia coli experiencing nitrogen (N) starvation, we corroborated the important role of the three RNA-binding surfaces for Hfq function. We found that two previously uncharacterized conserved residues, V22 and G34, in the hydrophobic core of Hfq have a profound impact on Hfq's RNA-binding activity in vivo. Transcriptome-scale analysis revealed that V22A and G34A Hfq mutants cause widespread destabilization of both sRNA classes, to the same extent as seen in bacteria devoid of Hfq. However, the alanine substitutions at these residues resulted in only modest alteration in stability and structure of Hfq. We propose that V22 and G34 contribute to Hfq function in a way that becomes especially critical under cellular conditions of increased demand for Hfq, such as N starvation.
Precursor (pre)-CRISPR RNA (crRNA) processing can occur in both the repeat and spacer regions, leading to the removal of specific segments from the repeat and spacer sequences, thereby facilitating crRNA maturation. Processing of the pre-crRNA repeat by Cas effectors and ribonucleases has been observed in CRISPR-Cas9 and CRISPR-Cas12a systems. However, no evidence of pre-crRNA spacer cleavage by any enzyme has been reported in these systems. In this study, we demonstrate that DNA target binding triggers efficient cleavage of pre-crRNA spacers by type II and V Cas effectors such as Cas12a, Cas12b, Cas12i, Cas12j and Cas9. We show that the pre-crRNA spacer cleavage catalyzed by Cas12a and Cas9 has distinct characteristics. Cleavage activity in Cas12a is activated by binding of either single-stranded DNA (ssDNA) or double-stranded DNA targets, whereas in Cas9 only ssDNA target binding triggers cleavage of the pre-crRNA spacer. We present a series of structures elucidating the underlying mechanisms governing conformational activation in both Cas12a and Cas9. Furthermore, leveraging the trans-cutting activity of the pre-crRNA spacer, we develop a one-step DNA detection method characterized by its simplicity, high sensitivity, and excellent specificity.
Genome graphs, including the recently released draft human pangenome graph, can represent the breadth of genetic diversity and thus transcend the limits of traditional linear reference genomes. However, there are no genome-graph-compatible tools for analyzing whole genome bisulfite sequencing (WGBS) data. To close this gap, we introduce methylGrapher, a tool tailored for accurate DNA methylation analysis by mapping WGBS data to a genome graph. Notably, methylGrapher can reconstruct methylation patterns along haplotype paths precisely and efficiently. To demonstrate the utility of methylGrapher, we analyzed the WGBS data derived from five individuals whose genomes were included in the first Human Pangenome draft as well as WGBS data from ENCODE (EN-TEx). Along with standard performance benchmarking, we show that methylGrapher fully recapitulates DNA methylation patterns defined by classic linear genome analysis approaches. Importantly, methylGrapher captures a substantial number of CpG sites that are missed by linear methods, and improves overall genome coverage while reducing alignment reference bias. Thus, methylGrapher is a first step toward unlocking the full potential of Human Pangenome graphs in genomic DNA methylation analysis.
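For readers unfamiliar with bisulfite data, the quantity being estimated at each CpG is the fraction of reads that retain a C (methylated) rather than showing a T (unmethylated, converted by bisulfite treatment). The sketch below computes that fraction against a plain linear reference with gapless forward-strand reads. It illustrates the general principle only and is not methylGrapher's graph-based method; the function name `cpg_methylation` and the `(start, sequence)` read format are assumptions made for the example.

```python
from collections import defaultdict


def cpg_methylation(reference, reads):
    """Estimate per-CpG methylation levels from forward-strand bisulfite reads.

    reference: reference sequence as a string.
    reads: iterable of (start, sequence) pairs, each a gapless alignment
           beginning at 0-based reference position `start`.
    Returns {cytosine_position: methylated / (methylated + unmethylated)}.
    """
    # 0-based positions of the C in every CG dinucleotide of the reference.
    cpg_positions = {
        i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"
    }
    counts = defaultdict(lambda: [0, 0])  # position -> [methylated, unmethylated]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos not in cpg_positions:
                continue
            if base == "C":      # protected from conversion: methylated
                counts[pos][0] += 1
            elif base == "T":    # converted by bisulfite: unmethylated
                counts[pos][1] += 1
            # Any other base is treated as uninformative and skipped.
    return {pos: m / (m + u) for pos, (m, u) in sorted(counts.items())}


if __name__ == "__main__":
    ref = "ACGTTCGA"
    reads = [(0, "ACGTTTGA"), (0, "ATGTTTGA"), (4, "TCGA")]
    print(cpg_methylation(ref, reads))
```

A real pipeline would also handle reverse-strand reads, indels, base qualities, and sequence variants at the CpG itself; the last of these is one place where a linear reference can mislead.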