HeLa cell transfection with plasmid DNA (pDNA) is widely used to materialize biologicals and as a preclinical test of nucleic acid-based vaccine efficacy. We sought to genetically encode mammalian transfection sensor (Trensor) circuits and test their utility in HeLa cells for detecting molecules and methods for their propensity to influence transfection. We intended these Trensor circuits to be triggered if their host cell was treated with polyplexed pDNA or certain small-molecule modulators of transfection. We prioritized three promoters, implicated by others in feedback responses as cells import and process foreign material, and stably integrated each into the genomes of three different cell lines, each upstream of a green fluorescent protein (GFP) open reading frame within a transgene. All three Trensor circuits showed an increase in their GFP expression when their host HeLa cells were incubated with pDNA and the degraded polyamidoamine dendrimer reagent, SuperFect. We next experimentally demonstrated the modulation of PEI-mediated HeLa cell transient transfection by four different small molecules, with Trichostatin A (TSA) showing the greatest propensity to boost transgene expression. The Trensor circuit based on the TRA2B promoter (Trensor-T) was triggered by incubation with TSA alone and not the other three small molecules. These data suggest that mammalian reporter circuits could enable low-cost, high-throughput screening to identify novel transfection methods and reagents without the need to perform actual transfections requiring costly plasmids or expensive fluorescent labels.
Genome integration enables host organisms to stably carry heterologous DNA messages, introducing new genotypes and phenotypes for expanded applications. While several genome integration approaches have been reported, a scalable tool for DNA message storage within site-specific genome landing pads is still lacking. Here, we introduce an iterative genome integration method utilizing orthogonal serine integrases, enabling the stable storage of multiple heterologous genes in the chromosome of Escherichia coli MG1655. By leveraging serine integrases TP901-1, Bxb1, and PhiC31, along with engineered integration vectors, we demonstrate high-efficiency, marker-free integration of DNA fragments up to 13 kb in length. To further simplify the procedure, we then develop a streamlined integration method and showcase the system’s versatility by constructing an engineered E. coli strain capable of storing and expressing multiple genes from diverse species. Additionally, we illustrate the potential utility of these engineered strains for synthetic biology applications, including in vivo and in vitro protein expression. Our work extends the application scope of serine integrases for scalable gene integration cascades, with implications for genome manipulation and gene storage applications in synthetic biology.
The N-terminal coding sequence (NCS) influences gene expression by impacting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression. The problem is important in genetic engineering. However, current methods for NCS optimization, such as rational design and statistics-guided approaches, are labor-intensive and yield only relatively small improvements. This paper introduces a deep learning/synthetic biology codesigned few-shot training workflow for NCS optimization. Our method utilizes k-nearest encoding followed by word2vec to encode the NCS, then performs feature extraction using attention mechanisms, before constructing a time-series network for predicting gene expression intensity, and finally a direct search algorithm identifies the optimal NCS with limited training data. We took green fluorescent protein (GFP) expressed by Bacillus subtilis as a reporter protein for NCSs, and employed the fluorescence enhancement factor as the metric of NCS optimization. Within just six iterative experiments, our model generated an NCS (MLD62) that increased average GFP expression by 5.41-fold, outperforming the state-of-the-art NCS designs. Extending our findings beyond GFP, we showed that our engineered NCS (MLD62) can effectively boost the production of N-acetylneuraminic acid by enhancing the expression of the crucial rate-limiting GNA1 gene, demonstrating its practical utility. We have open-sourced our NCS expression database and experimental procedures for public use.
The progress and utility of synthetic biology are currently hindered by the lengthy process of studying literature and replicating poorly documented work. Reconstruction of crucial design information through post hoc curation is highly noisy and error-prone. To combat this, author participation during the curation process is crucial. To encourage author participation without overburdening them, an ML-assisted curation tool called SeqImprove has been developed. Using named entity recognition, named entity normalization, and sequence matching, SeqImprove creates machine-accessible sequence data and metadata annotations, which authors can then review and edit before submitting a final sequence file. SeqImprove makes it easier for authors to submit sequence data that is FAIR (findable, accessible, interoperable, and reusable).
Mathematical modeling is indispensable in synthetic biology but remains underutilized. Tackling problems, from optimizing gene networks to simulating intracellular dynamics, can be facilitated by the ever-growing body of modeling approaches, be they mechanistic, stochastic, data-driven, or AI-enabled. Thanks to progress in the AI community, robust frameworks have emerged to enable researchers to access complex computational hardware and compilation. Previously, these frameworks focused solely on deep learning, but they have been developed to the point where running different forms of computation is relatively simple, as made possible, notably, by the JAX library. Running simulations at scale on GPUs speeds up research, which in turn enables larger-scale experiments and greater usability of code. As JAX remains underexplored in computational biology, we demonstrate its utility in three example projects ranging from synthetic biology to directed evolution, each with an accompanying demonstrative Jupyter notebook. We hope that these tutorials serve to democratize the flexible scaling, faster run-times, easy GPU portability, and mathematical enhancements (such as automatic differentiation) that JAX brings, all with only minor restructuring of code.
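To give a flavor of the features named in this abstract, the following is a minimal sketch of automatic differentiation, vectorization, and compilation in JAX. The Hill repression function and its parameter values are illustrative assumptions chosen for brevity; they are not taken from the paper's notebooks.

```python
import jax
import jax.numpy as jnp


def hill_repression(repressor, k=1.0, n=2.0):
    """Fraction of maximal expression under a repressor (illustrative Hill function)."""
    return 1.0 / (1.0 + (repressor / k) ** n)


# Automatic differentiation: sensitivity of expression to the repressor level.
sensitivity = jax.grad(hill_repression)

# Vectorize over many repressor levels and compile the result.
# The same code runs on CPU or GPU, depending on the installed backend.
batched_sensitivity = jax.jit(jax.vmap(sensitivity))

levels = jnp.array([0.5, 1.0, 2.0])    # hypothetical repressor concentrations
expression = hill_repression(levels)   # expression fraction at each level
slopes = batched_sensitivity(levels)   # d(expression)/d(repressor) at each level

print(expression)
print(slopes)
```

The point of the sketch is that the model itself stays an ordinary Python function; `jax.grad`, `jax.vmap`, and `jax.jit` are applied on top of it rather than requiring the model to be rewritten.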
The methylotrophic yeast Ogataea polymorpha has become a promising cell factory due to its efficient utilization of methanol to produce high value-added chemicals. However, the low homologous recombination (HR) efficiency in O. polymorpha greatly hinders extensive metabolic engineering for industrial applications. Overexpression of HR-related genes has successfully improved HR efficiency, but their constitutive expression causes cellular stress and reduces chemical production. Here, we engineered an HR repair pathway in which the gene ScRAD51 is dynamically regulated under the control of the l-rhamnose-induced promoter PLRA3, building on the previously constructed CRISPR-Cas9 system in O. polymorpha. Under optimal induction conditions, an appropriate expression level of ScRAD51 achieved HR rates of up to 60%, 10-fold higher than that of the wild-type strain, without any detectable influence on cell growth in methanol. When adopted as the chassis strain for bioproduction, the strain with the dynamically regulated recombination system produced 50% higher titers of fatty alcohols than the strain with the static regulation system. Therefore, this study provides a feasible platform in O. polymorpha for convenient genetic manipulation without perturbing cellular fitness.
Differentiation within multicellular organisms is a complex process that helps to establish spatial patterning and tissue formation within the body. Often, the differentiation of cells is governed by morphogens and intercellular signaling molecules that guide the fate of each cell, frequently using toggle-like regulatory components. Synthetic biologists have long sought to recapitulate patterned differentiation with engineered cellular communities, and various methods for differentiating bacteria have been invented. Here, we couple a synthetic corepressive toggle switch with intercellular signaling pathways to create a “quorum-sensing toggle”. We show that this circuit not only exhibits population-wide bistability in a well-mixed liquid environment but also generates patterns of differentiation in colonies grown on agar containing an externally supplied morphogen. If coupled to other metabolic processes, circuits such as the one described here would allow for the engineering of spatially patterned, differentiated bacteria for use in biomaterials and bioelectronics.
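The bistability that a corepressive toggle relies on can be illustrated with a minimal sketch. It assumes the classic dimensionless two-repressor toggle model (each repressor inhibits synthesis of the other through a Hill term) and omits the quorum-sensing coupling entirely, so it is not the authors' circuit. The parameter values (`alpha`, `n`) and the function name are hypothetical choices for illustration.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Forward-Euler integration of a dimensionless two-repressor toggle.

    du/dt = alpha / (1 + v**n) - u
    dv/dt = alpha / (1 + u**n) - v

    alpha (maximal synthesis rate) and n (Hill coefficient) are assumed values.
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


# Two mirror-image starting points settle into opposite stable states.
u_high, v_low = simulate_toggle(u0=5.0, v0=1.0)
u_low, v_high = simulate_toggle(u0=1.0, v0=5.0)

print(f"start u>v: u={u_high:.2f}, v={v_low:.2f}")
print(f"start v>u: u={u_low:.2f}, v={v_high:.2f}")
```

In this simplified model, whichever repressor starts ahead ends up dominant, which is the toggle-like memory that a population-level signal would then have to coordinate across cells.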
Vanillin is a widely used flavoring compound in the food, pharmaceutical, and cosmetics industries. However, the biosynthesis of vanillin from low-cost shikimic acid is significantly hindered by the low activity of the rate-limiting enzyme, caffeate O-methyltransferase (COMT). To screen COMT variants with improved conversion rates, we designed a biosensing system that is adaptable to the COMT-mediated vanillin synthetic pathway. Through the evolution of aldehyde transcriptional factor YqhC, we obtained a dual-responsive variant, MuYqhC, which positively responds to the product and negatively responds to the substrate, with no response to intermediates. Using the MuYqhC-based vanillin biosensor, we successfully identified a COMT variant, Mu176, that displayed a 7-fold increase in the conversion rate compared to the wild-type COMT. This variant produced 2.38 mM vanillin from 3 mM protocatechuic acid, achieving a conversion rate of 79.33%. The enhanced activity of Mu176 was attributed to an enlarged binding pocket and strengthened substrate interaction. Applying Mu176 to Bacillus subtilis increased the level of vanillin production from shikimic acid by 2.39-fold. Further optimization of the production chassis, increasing the S-adenosylmethionine supply and the precursor concentration, elevated the vanillin titer to 1 mM, marking the highest level of vanillin production from shikimic acid in Bacillus. Our work highlights the significance of the MuYqhC-based biosensing system and the Mu176 variant in vanillin production.
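As a quick check of the reported figure, the conversion rate can be recomputed from the two concentrations quoted in this abstract. The sketch assumes a 1:1 molar stoichiometry between protocatechuic acid and vanillin; the function name is illustrative and not from the paper.

```python
def conversion_rate_percent(product_mm: float, substrate_mm: float) -> float:
    """Molar conversion in percent, assuming one product molecule per substrate molecule."""
    if substrate_mm <= 0:
        raise ValueError("substrate concentration must be positive")
    return 100.0 * product_mm / substrate_mm


# Values quoted in the abstract: 2.38 mM vanillin from 3 mM protocatechuic acid.
rate = conversion_rate_percent(2.38, 3.0)
print(f"{rate:.2f}%")  # prints 79.33%
```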
Auxins are crucial signaling molecules that regulate the growth, metabolism, and behavior of various organisms, most notably plants but also bacteria, fungi, and animals. Many microbes synthesize and perceive auxins, primarily indole-3-acetic acid (IAA, referred to as auxin herein), the most prevalent natural auxin, which influences their ability to colonize plants and animals. Understanding auxin biosynthesis and signaling in fungi may allow us to better control interkingdom relationships and microbiomes from agricultural soils to the human gut. Despite this importance, a biological tool for measuring auxin with high spatial and temporal resolution has not been engineered in fungi. In this study, we present a suite of genetically encoded, ratiometric, protein-based auxin biosensors designed for the model yeast Saccharomyces cerevisiae. These biosensors are inspired by auxin signaling in plants, and their ratiometric nature enhances the precision of auxin concentration measurements by minimizing clonal and growth phase variation. We used these biosensors to measure auxin production across diverse growth conditions and phases in yeast cultures and calibrated their responses to physiologically relevant levels of auxin. Future work will aim to improve the fold change and reversibility of these biosensors. These genetically encoded auxin biosensors are valuable tools for investigating auxin biosynthesis and signaling in S. cerevisiae and potentially other yeasts and fungi and will also advance quantitative functional studies of the plant auxin perception machinery, from which they are built.