Are we close to a complete inventory of living processes so that we might expect in the near future to reproduce every essential aspect necessary for life? Or are there mechanisms and processes in cells and organisms that are presently inaccessible to us? Here I argue that a close examination of a particularly well-understood system--that of Escherichia coli chemotaxis--shows we are still a long way from a complete description. There is a level of molecular uncertainty, particularly that responsible for fine-tuning and adaptation to myriad external conditions, which we presently cannot resolve or reproduce on a computer. Moreover, the same uncertainty exists for any process in any organism and is especially pronounced and important in higher animals such as humans. Embryonic development, tissue homeostasis, immune recognition, memory formation, and survival in the real world, all depend on vast numbers of subtle variations in cell chemistry most of which are presently unknown or only poorly characterized. Overcoming these limitations will require us to not only accumulate large quantities of highly detailed data but also develop new computational methods able to recapitulate the massively parallel processing of living cells.
We develop an exact and flexible mathematical model for Lutz and Bujard's controllable promoters. It can be used as a building block for modeling genetic systems based on them. Special attention is paid to deducing all the model parameters from reported (in vitro) experimental data. We validate our model by comparing the regulatory ranges measured in vivo by Lutz and Bujard against the ranges predicted by the model, where a regulatory range is calculated as the reporter activity obtained under inducing conditions divided by the activity measured under maximal repression. In particular, we verify the assertion of Bond et al. that the cooperativity between two lac operators can be assumed to be negligible when their central base pairs are separated by 22 or 32 bp [Gene repression by minimal lac loops in vivo, Nucleic Acids Res, 38 (2010) 8072-8082]. Moreover, we also find that the probability that two LacI repressors bind to these operators at the same time can be assumed to be negligible as well. We finally use the model for the promoter P(LlacO-1) to analyze a synthetic genetic oscillator recently built by Stricker et al. [A fast, robust and tunable synthetic gene oscillator, Nature, 456 (2008) 516-519].
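The regulatory range defined above (activity under induction divided by activity under maximal repression) can be illustrated with a minimal sketch. The single-operator equilibrium occupancy used here is a textbook simplification chosen for illustration, not the model developed in the paper. The function names and the numerical values are hypothetical.

```python
def free_operator_probability(active_repressor, dissociation_constant):
    """Probability that a single operator is unoccupied.

    Assumption (not from the paper): one operator, simple equilibrium
    binding, both arguments in the same concentration units.
    """
    return 1.0 / (1.0 + active_repressor / dissociation_constant)


def regulatory_range(induced_activity, repressed_activity):
    """Reporter activity under induction divided by activity under
    maximal repression, as defined in the abstract."""
    if repressed_activity <= 0:
        raise ValueError("repressed activity must be positive")
    return induced_activity / repressed_activity


# Hypothetical numbers: active repressor at 99 times its dissociation
# constant when repressed, and fully inactivated when induced.
repressed = free_operator_probability(active_repressor=99.0, dissociation_constant=1.0)
induced = free_operator_probability(active_repressor=0.0, dissociation_constant=1.0)
example_range = regulatory_range(induced, repressed)
print(example_range)
```

Under this simplification the range reduces to 1 + R/K, so it grows linearly with the ratio of active repressor to dissociation constant.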
In this review, we survey work carried out by biomathematicians attempting to understand the dynamic behaviour of simple bacterial operons, starting with the initial work of the 1960s. We concentrate on the simplest of situations, discussing both repressible and inducible systems and then turning to concrete examples related to the biology of the lactose and tryptophan operons. We conclude with a brief discussion of the role of both extrinsic noise and so-called intrinsic noise in the form of translational and/or transcriptional bursting.
Analysis of metabolic networks typically begins with construction of the stoichiometry matrix, which characterizes the network topology. This matrix provides, via the balance equation, a description of the potential steady-state flow distribution. This paper begins with the observation that the balance equation depends only on the structure of linear redundancies in the network, and so can be stated in a succinct manner, leading to computational efficiencies in steady-state analysis. This alternative description of steady-state behaviour is then used to provide a novel method for network reduction, which complements existing algorithms for describing intracellular networks in terms of input-output macro-reactions (to facilitate bioprocess optimization and control). Finally, it is demonstrated that this novel reduction method can be used to address elementary mode analysis of large networks: the modes supported by a reduced network can capture the input-output modes of a metabolic module with significantly reduced computational effort.
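As a minimal sketch of the standard balance equation S v = 0 mentioned above, the following computes a basis of steady-state flux vectors for a toy network by exact Gauss-Jordan elimination. It illustrates only the conventional formulation, not the succinct alternative or the reduction method proposed in the paper. The four-reaction network and the function name are hypothetical.

```python
from fractions import Fraction


def null_space(matrix):
    """Return a basis of {v : S v = 0} using exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in row order
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        # Find a row at or below r with a nonzero entry in column c.
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        # Clear column c in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # Each free (non-pivot) column yields one basis vector.
    basis = []
    for f in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            v[p] = -rows[row_idx][f]
        basis.append(v)
    return basis


# Hypothetical toy network with metabolites A, B (rows) and reactions
# v1: -> A, v2: A -> B, v3: B ->, v4: A -> (columns).
S = [
    [1, -1, 0, -1],
    [0, 1, -1, 0],
]
steady_state_basis = null_space(S)
for v in steady_state_basis:
    print([str(x) for x in v])
```

For this toy network the two basis vectors correspond to flux through the A -> B pathway and flux through the direct drain of A.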
Recent evidence suggests that cells employ functionally asymmetric partitioning schemes in division to cope with aging. We explore various schemes in silico, with a stochastic model of Escherichia coli that includes gene expression, the generation, aggregation, and polar retention of non-functional proteins, and molecule partitioning in division. The model is implemented in SGNS2, which allows stochastic, multi-delayed reactions within hierarchical, transient, interlinked compartments. After setting parameter values for the generation and effects of non-functional proteins that reproduce realistic intracellular and population dynamics, we investigate how the spatial organization of non-functional proteins affects mean division times of cell populations in lineages and, thus, mean cell numbers over time. We find that division times decrease for increasingly asymmetric partitioning. Also, increasing the clustering of non-functional proteins decreases division times. Increasing the bias in polar segregation further decreases division times, particularly if the bias favors the older pole and the polar retention of aggregates is robust. Finally, we show that the non-energy-consuming retention of inherited non-functional proteins at the older pole via nucleoid occlusion is a source of functional asymmetries and, thus, is advantageous. Our results suggest that the mechanisms of intracellular organization of non-functional proteins, including clustering and polar retention, affect the vitality of E. coli populations.
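The idea of biased partitioning at division can be sketched as follows, assuming each aggregate is assigned independently to the old-pole daughter with a fixed probability. This is an illustrative simplification only. It is not the SGNS2 model described above, and the function and parameter names are hypothetical.

```python
import random


def partition_aggregates(n_aggregates, bias_to_old_pole, rng):
    """Split aggregates between the two daughter cells at division.

    Assumption: each aggregate independently goes to the daughter that
    inherits the older pole with probability bias_to_old_pole.
    A bias of 0.5 is symmetric; a bias of 1.0 is fully asymmetric.
    """
    if not 0.0 <= bias_to_old_pole <= 1.0:
        raise ValueError("bias_to_old_pole must lie in [0, 1]")
    to_old = sum(1 for _ in range(n_aggregates) if rng.random() < bias_to_old_pole)
    return to_old, n_aggregates - to_old


rng = random.Random(0)  # fixed seed so repeated runs give the same split
old_daughter, new_daughter = partition_aggregates(100, 0.8, rng)
print(old_daughter, new_daughter)
```

Passing the random generator explicitly keeps the sketch reproducible and makes it easy to compare partitioning schemes under the same sequence of random draws.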
ChemModLab, written by the ECCR @ NCSU consortium under NIH support, is a toolbox for fitting and assessing quantitative structure-activity relationships (QSARs). Its elements are: a cheminformatic front end used to supply molecular descriptors for use in modeling; a set of methods for fitting models; and methods for validating the resulting model. Compounds may be input as structures from which standard descriptors will be calculated using the freely available cheminformatic front end PowerMV; PowerMV also supports compound visualization. In addition, users can directly input their own choices of descriptors, so the capability for comparing descriptors is effectively unlimited. The statistical methodologies comprise a comprehensive collection of approaches whose validity and utility have been accepted by experts in the field. As far as possible, these tools are implemented in open-source software linked into the flexible R platform, giving the user the capability of applying many different QSAR modeling methods in a seamless way. As promising new QSAR methodologies emerge from the statistical and data-mining communities, they will be incorporated in the laboratory. The web site also incorporates links to public-domain data sets that can be used as test cases for proposed new modeling methods. The capabilities of ChemModLab are illustrated using a variety of biological responses, with different modeling methodologies being applied to each. These show clear differences in the quality of the fitted QSAR model and in computational requirements. The laboratory is web-based, and use is free. Researchers with new assay data, a new descriptor set, or a new modeling method may readily build QSAR models and benchmark their results against other findings. Users may also examine the diversity of the molecules identified by a QSAR model. Moreover, users can place their data sets in a public area to facilitate communication with other researchers, or keep them hidden to preserve confidentiality.
This paper addresses the problem of reconstructing viral quasispecies from next-generation sequencing reads obtained from amplicons (i.e., reads generated from predefined amplified overlapping regions). We compare the parsimonious and likelihood models for this problem and propose several novel assembly algorithms. The proposed methods have been validated on simulated error-free HCV and real HBV amplicon reads. The new algorithms have been shown to outperform the method of Prosperi et al. Our experiments also show that, in most cases, viral quasispecies can be reconstructed more accurately from amplicon reads than from shotgun reads. All algorithms have been implemented and made available at https://bitbucket.org/nmancuso/bioa/.
Hepatitis C virus (HCV) is a major cause of liver disease world-wide. Current interferon and ribavirin (IFN/RBV) therapy is effective in 50%-60% of patients. HCV exists in infected patients as a large viral population of intra-host variants (quasispecies), which may be differentially resistant to interferon treatment. We present a method for measuring differential interferon resistance of HCV quasispecies based on mathematical modeling and analysis of HCV population dynamics during the first hours of interferon therapy. The mathematical models showed that individual intra-host HCV variants have a wide range of resistance to IFN treatment in each patient. Analysis of differential IFN resistance among intra-host HCV variants allows for accurate prediction of response to IFN therapy. The models strongly suggest that resistance to interferon may vary broadly among closely related variants in infected hosts and therapy outcome may be defined by a single or a few variants irrespective of their frequency in the intra-host HCV population before treatment.
Microarray technology facilitates the monitoring of the expression levels of thousands of genes over different experimental conditions simultaneously. Clustering is a popular data mining tool which can be applied to microarray gene expression data to identify co-expressed genes. Most traditional clustering methods optimize a single clustering goodness criterion and thus may not perform well on all kinds of datasets. Motivated by this, in this article, a multiobjective clustering technique that optimizes cluster compactness and separation simultaneously has been improved through a novel cluster ensemble method based on support vector machine classification. The superiority of MOCSVMEN (MultiObjective Clustering with Support Vector Machine based ENsemble) has been established by comparing its performance with that of several well-known existing microarray data clustering algorithms. Two real-life benchmark gene expression datasets have been used for testing the comparative performance of the different algorithms. A recently developed metric called the Biological Homogeneity Index (BHI), which measures clustering goodness with respect to functional annotation, has been used for the comparison.
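A minimal sketch of a Biological Homogeneity Index computation is given below, assuming the commonly cited definition: for each cluster, the fraction of annotated gene pairs that share at least one functional class, averaged over clusters. The abstract does not spell out the formula, so the definition, the handling of clusters with fewer than two annotated genes, and all names and data here are assumptions for illustration.

```python
from itertools import combinations


def biological_homogeneity_index(clusters, annotations):
    """Average, over clusters, of the fraction of annotated gene pairs
    in the cluster that share at least one functional class.

    clusters: list of lists of gene identifiers.
    annotations: dict mapping a gene identifier to a set of functional
    classes; unannotated genes are ignored.
    Assumption: a cluster with fewer than two annotated genes scores 0.
    """
    if not clusters:
        raise ValueError("at least one cluster is required")
    scores = []
    for genes in clusters:
        annotated = [g for g in genes if annotations.get(g)]
        pairs = list(combinations(annotated, 2))
        if not pairs:
            scores.append(0.0)
            continue
        shared = sum(1 for a, b in pairs if annotations[a] & annotations[b])
        scores.append(shared / len(pairs))
    return sum(scores) / len(scores)


# Hypothetical gene identifiers and functional classes.
example_clusters = [["g1", "g2", "g3"], ["g4", "g5"]]
example_annotations = {
    "g1": {"a"},
    "g2": {"a", "b"},
    "g3": {"b"},
    "g4": {"c"},
    "g5": {"d"},
}
example_bhi = biological_homogeneity_index(example_clusters, example_annotations)
print(example_bhi)
```

In this toy example two of the three pairs in the first cluster share a class and the single pair in the second cluster does not, so the index is the mean of 2/3 and 0.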