Pub Date: 2022-10-01, Epub Date: 2022-09-12, DOI: 10.1142/S0219720022500202
Caijie Gao, Xu Zhao, Jianrong Fan
The peroxisome proliferator-activated receptor-[Formula: see text] (PPAR[Formula: see text]) is a member of the PPAR nuclear receptor family, and its antagonists have been widely used to treat pediatric metabolic disorders. Traditional type-1 and type-2 PPAR[Formula: see text] antagonists are all small-molecule compounds developed to target the ligand-binding site (LBS) of PPAR[Formula: see text], which does not overlap with its coactivator-interacting site (CIS). In this study, we describe the rational design of type-3 peptidic antagonists that directly disrupt the PPAR[Formula: see text]-coactivator interaction by physically competing with coactivator proteins for the CIS. Seven reported PPAR[Formula: see text] coactivator proteins were collected, and eight 11-mer helical peptide segments containing the core PPAR[Formula: see text]-binding LXXLL motif were identified in them. These segments, however, are highly flexible and intrinsically disordered once split from their coactivator protein context, and would therefore incur a considerable entropy penalty (i.e. indirect readout) upon binding to the PPAR[Formula: see text] CIS. By carefully examining the natively folded conformation of these helical peptides in their parent protein context and their interaction mode with the CIS, we rationally designed a hydrocarbon bridge across the solvent-exposed (i, i + 4) residues to constrain their helical conformation, thus largely minimizing the unfavorable indirect readout effect while having only a moderate influence on the favorable enthalpy contribution (i.e. direct readout) upon PPAR[Formula: see text]-peptide binding. The computational findings were further substantiated by fluorescence competition assays.
Title: Computational design and experimental confirmation of conformationally constrained peptides to compete with coactivators for pediatric PPAR[Formula: see text] by minimizing indirect readout effect. Journal: Journal of Bioinformatics and Computational Biology.
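The staple-placement step in the abstract above can be illustrated with a minimal Python sketch. It only enumerates (i, i + 4) index pairs in an 11-mer that leave the three leucines of the LXXLL motif untouched. Everything here is an assumption for illustration: the peptide sequence is hypothetical, the function name is invented, and real solvent exposure has to be judged from the complex structure, which this sketch does not model.

```python
import re


def candidate_staple_pairs(peptide: str) -> list[tuple[int, int]]:
    """Return 0-based (i, i + 4) pairs that avoid the LXXLL leucines."""
    match = re.search(r"L..LL", peptide)
    if match is None:
        raise ValueError("no LXXLL motif found")
    start = match.start()
    # The motif leucines sit at offsets 0, 3 and 4 from the motif start.
    motif_leucines = {start, start + 3, start + 4}
    pairs = []
    for i in range(len(peptide) - 4):
        j = i + 4
        if i not in motif_leucines and j not in motif_leucines:
            pairs.append((i, j))
    return pairs


# Hypothetical 11-mer, not one of the eight peptides from the study.
HYPOTHETICAL_PEPTIDE = "SKALEELLRQG"
print(candidate_staple_pairs(HYPOTHETICAL_PEPTIDE))
```

Each returned pair is only a candidate: a pair would still need to face the solvent rather than the CIS surface before a bridge could be placed there.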
Pub Date: 2022-09-08, DOI: 10.1142/s0219720022500226
Sarah-Laure Rincourt, S. Michiels, D. Drubay
The development of prognostic molecular signatures that account for inter-patient heterogeneity is a key challenge for precision medicine. We propose a joint model of this heterogeneity and patient survival, assuming that tumor expression results from a mixture of a subset of independent signatures. We deconvolute the omics data using a non-parametric independent component analysis with a double sparseness structure for the source and the weight matrices, corresponding to the gene-component and individual-component associations, respectively. In a simulation study, our approach identified the correct number of components and reconstructed with high accuracy the sparseness of the weight ([Formula: see text]0.85) and the source ([Formula: see text]0.75) matrices. The selection rate of components with high-to-moderate prognostic impacts was close to 95%, while components with weak impacts were selected with a frequency close to the observed false positive rate ([Formula: see text]25%). When applied to the expression of 1063 genes from 614 breast cancer patients, our model identified 15 components, including six associated with patient survival and related to three known prognostic pathways in early breast cancer (i.e. immune system, proliferation, and stromal invasion). The proposed algorithm provides new insight into the individual molecular heterogeneity associated with patient prognosis, helping to better understand complex tumor mechanisms.
Title: A non-parametric Bayesian joint model for latent individual molecular profiles and survival in oncology
Pub Date: 2022-08-01, Epub Date: 2022-08-03, DOI: 10.1142/S0219720022500172
Hongliang Zou
RNA 5-hydroxymethylcytosine (5hmC) is an important RNA modification that plays a vital role in several biological processes. Identifying 5hmC sites is currently a hot topic because it helps in understanding their biological functions. Therefore, in this study, we developed a predictor called iRNA5hmC-HOC, which is based on a high-order correlation information method to identify 5hmC sites. To build the model, 22 different classes of dinucleotide physicochemical (PC) properties were employed to represent RNA sequences, and the least absolute shrinkage and selection operator (LASSO) algorithm was adopted to select the most discriminative features. In the jackknife test, the proposed method achieved 89.80% classification accuracy with a support vector machine (SVM). Compared with the state-of-the-art predictors, the proposed method significantly improves classification performance, which indicates that it might be a promising tool for identifying RNA 5hmC modification sites. The dataset and source codes are available at https://figshare.com/articles/online_resource/iRNA5hmC-HOC/15177450.
Title: iRNA5hmC-HOC: High-order correlation information for identifying RNA 5-hydroxymethylcytosine modification.
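The jackknife test mentioned in the abstract above is a leave-one-out evaluation: each sample is held out once, the model is trained on the rest, and the held-out sample is predicted. The Python sketch below shows only that evaluation loop. It is not the authors' pipeline: a nearest-centroid rule stands in for the SVM, there is no LASSO step, and the feature vectors are toy values invented for illustration.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (not an SVM): pick the class with the closest mean vector."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, query))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(features, labels):
    """Leave-one-out accuracy: every sample is predicted by a model that never saw it."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Toy, well-separated feature vectors (hypothetical, not from the paper's dataset).
toy_features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_labels = [0, 0, 0, 1, 1, 1]
print(jackknife_accuracy(toy_features, toy_labels))
```

Any feature selection would have to be repeated inside each fold on the training part only, otherwise the held-out sample leaks into the model.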
Pub Date: 2022-08-01, DOI: 10.1142/S0219720022500123
Nadia Tahiri, Andrey Veriga, Aleksandr Koshkarov, Boris Morozov
The evolutionary histories of genes can differ greatly from one another, which may be explained by horizontal gene transfer or recombination events. A phylogenetic tree therefore represents the evolutionary history of each gene, and it may present different patterns from the species tree that defines the main evolutionary patterns. In addition, phylogenetic trees of closely related species should be merged, thus minimizing the topological conflicts they present and yielding consensus trees (in the case of homogeneous data) or supertrees (in the case of heterogeneous data). The traditional approaches are consensus tree inference (if the trees contain the same set of species) or supertree inference (if the trees contain different but overlapping sets of species). Consensus trees and supertrees are constructed to produce a single tree. However, these methods lose precision with respect to evolutionary variability. Other approaches have been implemented to preserve this variability using the k-means or the k-medoids algorithm. Using a new method, we determine all possible consensus trees and supertrees that best represent the most significant evolutionary models in a set of phylogenetic trees, thereby increasing the precision of the results and decreasing the time required. Results: This paper presents in detail a new method for predicting the number of clusters in a Robinson and Foulds (RF) distance matrix using a convolutional neural network (CNN). We developed a new CNN approach (called CNNTrees) for multiple tree classification. This new strategy returns a number of clusters of the input phylogenetic trees for different-size sets of trees, which makes the new approach more stable and more robust.
The paper provides an in-depth analysis of the relevant, but very difficult, problem of constructing alternative supertrees using phylogenies with different but overlapping sets of taxa. This new model will play an important role in the inference of Trees of Life (ToL). Availability and implementation: CNNTrees is available through a web server at https://tahirinadia.github.io/. The source code, data and information about installation procedures are also available at https://github.com/TahiriNadia/CNNTrees. Supplementary information: Supplementary data are available on the GitHub platform. The evolutionary history of species is not unique, but is specific to sets of genes. Indeed, each gene has its own evolutionary history that differs considerably from one gene to another. For example, some individual genes or operons may be affected by specific horizontal gene transfer and recombination events. Thus, the evolutionary history of each gene must be represented by its own phylogenetic tree, which may exhibit different evolutionary patterns than the species tree that accounts for the major vertical descent patterns. Traditional consensus tree or supertree inference methods result in a single consensus tree or supertree. In this paper, we present in detail a new method for predicting the number of clusters in a Robinson and Foulds (RF) distance matrix using a convolutional neural network (CNN). We developed a new CNN approach (CNNTrees) to build a multiple tree classification. This new strategy returns a number of clusters in the order of the input trees, which makes the new approach more stable and more robust.
Title: Invariant transformers of Robinson and Foulds distance matrices for Convolutional Neural Network.
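The Robinson and Foulds distance matrix that serves as the CNN input can be illustrated with a short Python sketch. It assumes trees are written as nested tuples of leaf names over the same leaf set, counts the non-trivial bipartitions found in one tree but not the other, and fills a pairwise matrix. The tree encoding, the function names and the example trees are assumptions made for illustration and are not taken from the CNNTrees code.

```python
def leaf_set(tree):
    """All leaf names below a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Non-trivial splits of the leaf set, each stored as the side without a reference leaf."""
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if reference in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return clade

    visit(tree)
    return splits


def rf_distance(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


def rf_matrix(trees):
    """Symmetric matrix of pairwise RF distances."""
    return [[rf_distance(a, b) for b in trees] for a in trees]


# Hypothetical five-taxon trees.
tree_1 = (("A", "B"), ("C", ("D", "E")))
tree_2 = (("A", "C"), ("B", ("D", "E")))
print(rf_matrix([tree_1, tree_2]))
```

Because splits are compared rather than rooted clades, two nested-tuple encodings of the same unrooted topology get distance 0. Trees with different but overlapping leaf sets (the supertree case) would need pruning to common taxa or a normalised variant first, which this sketch leaves out.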
Pub Date: 2022-08-01, Epub Date: 2022-08-08, DOI: 10.1142/S0219720022500184
Siddhartha Kundu
Whilst data on biochemical networks has increased several-fold, our comprehension of the underlying molecular biology is incomplete and inadequate. Simulation studies permit data collation from disparate time points, and the imputed trajectories can provide valuable insights into the molecular biology of complex biochemical systems. Although stochastic simulations are accurate, each run is an independent event and the data that is generated cannot be directly compared even with identical simulation times. This lack of robustness will preclude a biologically meaningful result for the metabolite(s) of concern and is a significant limitation of this approach. "TemporalGSSA" or temporal Gillespie Stochastic Simulation Algorithm is an R-wrapper which will collate and partition SSA-generated datasets with identical simulation times (trials) into finite sets of linear models (technical replicates). Each such model (time step of a single run, absolute number of molecules for a metabolite) computes several coefficients (slope, intercept, etc.). These coefficients are averaged (mean slope, mean intercept) across all trials of a technical replicate and, along with an imputed time step (mean, median, random), are incorporated into a linear regression equation. The solution to this equation is the number of molecules of a metabolite, which is used to compute the molar concentration of the metabolite per technical replicate. The summarized (mean, standard deviation) data of this vector of technical replicates is the outcome or numerical estimate of the molar concentration of a metabolite and is dependent on the duration of the simulation. If the SSA-generated dataset comprises runs with differing simulation times, "TemporalGSSA" can compute the time-dependent trajectory of a metabolite provided the trials-per-technical-replicate constraint is complied with.
The algorithms deployed by "TemporalGSSA" are rigorous, have a sound theoretical basis and have contributed meaningfully to our comprehension of the mechanism(s) that drive complex biochemical systems. "TemporalGSSA" is robust, freely accessible and easy to use with several readily testable examples.
Title: TemporalGSSA: A numerically robust R-wrapper to facilitate computation of a metabolite-specific and simulation time-dependent trajectory from stochastic simulation algorithm (SSA)-generated datasets.
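The averaging scheme described in the abstract above can be sketched in a few lines of Python (the package itself is an R-wrapper; Python is used here only for illustration). The sketch fits one least-squares line per trial, averages slopes and intercepts within a technical replicate, evaluates the line at an imputed time step, and converts molecules to molar concentration. The function names, the choice of the mean time step as the imputed value, the reaction volume and the toy data are all assumptions and do not reproduce the package's actual interface.

```python
from statistics import mean, stdev

AVOGADRO = 6.02214076e23  # molecules per mole


def fit_line(points):
    """Least-squares fit of molecule count against time; returns (slope, intercept)."""
    ts = [t for t, _ in points]
    ys = [y for _, y in points]
    t_bar, y_bar = mean(ts), mean(ys)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * t_bar


def replicate_concentration(trials, volume_litres):
    """Molar concentration for one technical replicate (a list of trials)."""
    fits = [fit_line(trial) for trial in trials]
    mean_slope = mean([slope for slope, _ in fits])
    mean_intercept = mean([intercept for _, intercept in fits])
    # Assumed imputation rule: the mean of all recorded time steps.
    imputed_t = mean([t for trial in trials for t, _ in trial])
    molecules = mean_slope * imputed_t + mean_intercept
    return molecules / (AVOGADRO * volume_litres)


def summarise(replicates, volume_litres):
    """Mean and standard deviation across technical replicates (needs at least two)."""
    values = [replicate_concentration(rep, volume_litres) for rep in replicates]
    return mean(values), stdev(values)


# Toy (time, molecules) trajectories, invented for illustration.
replicate_a = [[(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)], [(0.0, 2.0), (1.0, 4.0), (2.0, 6.0)]]
replicate_b = [[(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)], [(0.0, 3.0), (1.0, 5.0), (2.0, 7.0)]]
HYPOTHETICAL_VOLUME = 1e-15  # litres

print(summarise([replicate_a, replicate_b], HYPOTHETICAL_VOLUME))
```

Swapping the mean for the median or a random draw in `replicate_concentration` would give the other two imputation options named in the abstract.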
Pub Date: 2022-08-01, DOI: 10.1142/S021972002250010X
Andisheh Dadashi, Derek Martinez
Metabolism is an essential cellular process for the growth and maintenance of organisms. A better understanding of metabolism during embryogenesis may shed light on the developmental origins of human disease. Metabolic networks, however, are vastly complex, with many redundant pathways and interconnected circuits. Thus, computational approaches serve as a practical solution for unraveling the genetic basis of embryo metabolism to help guide future experimental investigations. RNA-sequencing and other profiling technologies make it possible to elucidate metabolic genotype-phenotype relationships, and yet our understanding of metabolism is limited. Very few studies have examined the temporal or spatial metabolomics of the human embryo, and the prohibitively small sample sizes traditionally observed in human embryo research have presented logistical challenges for metabolic studies, hindering progress towards the reconstruction of the human embryonic metabolome. We employed a network expansion algorithm to evolve the metabolic network of the peri-implantation embryo, and we utilized flux balance analysis (FBA) to examine the viability of the evolved networks. We found that modulating oxygen uptake promotes lactate diffusion across the outer mitochondrial layer, providing in-silico support for a proposed lactate-malate-aspartate shuttle. We developed a stage-specific model to serve as a proof-of-concept for the reconstruction of future metabolic models of development. Our work shows that it is feasible to model human metabolism with respect to time-dependent changes characteristic of peri-implantation development.
Title: Flux balance network expansion predicts stage-specific human peri-implantation embryo metabolism.
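The abstract above does not spell out its network expansion algorithm, so the Python sketch below shows one common reading of the term as an assumption: starting from seed metabolites, repeatedly add every reaction whose substrates are all available, and add its products to the pool, until nothing changes. The reaction names, metabolite labels and function name are invented, and the flux balance analysis step used to test viability is not shown.

```python
def expand_network(seed_metabolites, reactions):
    """Grow a reaction set from seeds; reactions maps name -> (substrates, products)."""
    available = set(seed_metabolites)
    active = []
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            # A reaction can fire once all of its substrates are in the pool.
            if name not in active and set(substrates) <= available:
                active.append(name)
                available |= set(products)
                changed = True
    return active, available


# Toy network with placeholder metabolites (not real biochemistry).
TOY_REACTIONS = {
    "R1": (["A"], ["B"]),
    "R2": (["B"], ["C", "D"]),
    "R3": (["D", "X"], ["E"]),  # never fires: X is not reachable from the seed
    "R4": (["C", "D"], ["F"]),
}

active, reachable = expand_network({"A"}, TOY_REACTIONS)
print(active, sorted(reachable))
```

In a full workflow, each expanded network would then be passed to a linear-programming solver for FBA to check whether it can carry a non-zero objective flux.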
Pub Date: 2022-08-01, Epub Date: 2022-08-03, DOI: 10.1142/S0219720022400042
Sona Charles, J Sreekumar, Jeyakumar Natarajan
Tetralogy of Fallot (TOF) is a cyanotic congenital condition to which genetic, epigenetic and environmental factors contribute. We applied sparse machine learning algorithms to RNAseq and sRNAseq data to select prospective biomarker candidates. Furthermore, we applied filtering techniques to identify a subset of biomarker pairs in TOF. Differential expression analysis disclosed 2757 genes and 214 miRNAs that are dysregulated. Weighted gene co-expression network analysis on the differentially expressed genes extracted five significant modules that are enriched in the GO terms extracellular matrix, signaling and calcium ion binding. Also, voomNSC selected two genes and five miRNAs, and transformed PLDA predicted 72 genes and 38 miRNAs as prognostic biomarkers. Among the selected biomarkers, miRNA target analysis revealed 14 miRNA-gene interactions. Of these 14 pairs, 10 were oppositely expressed, and four of those 10 shared the common pathways of focal adhesion and PI3K-Akt signaling. In conclusion, our study demonstrated the concept of biomarker pairs, which may be considered for clinical validation given their strong literature and experimental support.
Title: Transcriptomic meta-analysis reveals biomarker pairs and key pathways in Tetralogy of Fallot.
Pub Date : 2022-08-01 Epub Date: 2022-08-03 DOI: 10.1142/S0219720022020012
Yun Zheng
{"title":"Introduction to Selected Papers from InCoB 2021.","authors":"Yun Zheng","doi":"10.1142/S0219720022020012","DOIUrl":"https://doi.org/10.1142/S0219720022020012","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40576561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-08-01 Epub Date: 2022-07-25 DOI: 10.1142/S0219720022500147
Ali Burak Öncül, Yüksel Çelik, Necdet Mehmet Ünel, Mehmet Cengiz Baloglu
The basic helix loop helix (bHLH) superfamily is a large and diverse protein family that plays a role in various vital functions in nearly all animals and plants. The bHLH proteins form one of the largest families of transcription factors in plants and act as homo- or heterodimers to regulate the expression of their target genes. bHLH transcription factors are involved in many aspects of plant development and metabolism, including photomorphogenesis, light signal transduction, secondary metabolism, and stress response. The amount of molecular data has increased dramatically with the development of high-throughput techniques and the wide use of bioinformatics techniques. The most efficient way to use this information is to store and analyze the data in a well-organized manner. In this study, all members of the bHLH superfamily in the plant kingdom were used to develop and implement a relational database. We created a database called bHLHDB (www.bhlhdb.org) for bHLH family members, which can be queried by family or sequence information. Hidden Markov Model (HMM) searches, which researchers frequently use for sequence analysis, and BLAST queries were integrated into the database. In addition, a deep learning model was developed to predict the type of TF quickly and efficiently from the protein sequence alone, with 97.54% accuracy and 97.76% precision. We created a unique, next-generation database for bHLH transcription factors and made it available to the scientific community. We believe that the database will be a valuable tool in future studies of the bHLH family.
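For readers unfamiliar with the two metrics quoted above, the following minimal sketch shows their standard binary definitions computed from confusion-matrix counts. How the authors aggregated these metrics across TF types is not stated in the abstract, so this is only an illustration; the counts are hypothetical and do not reproduce the reported 97.54% and 97.76%.

```python
# Standard binary definitions; the counts below are hypothetical,
# not results from the bHLHDB study.

def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Fraction of positive predictions that are truly positive."""
    return tp / (tp + fp)


# Hypothetical confusion-matrix counts for one class.
tp, fp, tn, fn = 90, 10, 80, 20

acc = accuracy(tp, fp, tn, fn)
prec = precision(tp, fp)
print(f"accuracy={acc:.2f}, precision={prec:.2f}")
```

Accuracy counts every correct prediction, whereas precision considers only the sequences the model labelled as positive, which is why the two figures can differ.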
{"title":"bHLHDB: A next generation database of basic helix loop helix transcription factors based on deep learning model.","authors":"Ali Burak Öncül, Yüksel Çelik, Necdet Mehmet Ünel, Mehmet Cengiz Baloglu","doi":"10.1142/S0219720022500147","DOIUrl":"https://doi.org/10.1142/S0219720022500147","url":null,"abstract":"<p><p>The basic helix loop helix (bHLH) superfamily is a large and diverse protein family that plays a role in various vital functions in nearly all animals and plants. The bHLH proteins form one of the largest families of transcription factors found in plants that act as homo- or heterodimers to regulate the expression of their target genes. The bHLH transcription factor is involved in many aspects of plant development and metabolism, including photomorphogenesis, light signal transduction, secondary metabolism, and stress response. The amount of molecular data has increased dramatically with the development of high-throughput techniques and wide use of bioinformatics techniques. The most efficient way to use this information is to store and analyze the data in a well-organized manner. In this study, all members of the bHLH superfamily in the plant kingdom were used to develop and implement a relational database. We have created a database called bHLHDB (www.bhlhdb.org) for the bHLH family members on which queries can be conducted based on the family or sequences information. The Hidden Markov Model (HMM), which is frequently used by researchers for the analysis of sequences, and the BLAST query were integrated into the database. In addition, the deep learning model was developed to predict the type of TF with only the protein sequence quickly, efficiently, and with 97.54% accuracy and 97.76% precision. We created a unique and next-generation database for bHLH transcription factors and made this database available to the world of science. 
We believe that the database will be a valuable tool in future studies of the bHLH family.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40555017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-06-01 DOI: 10.1142/S0219720022500159
Omid Zarei, Stéphane L Raeppel, Maryam Hamzeh-Mivehroud
Recepteur d'Origine Nantais (RON) is a member of the receptor tyrosine kinase (RTK) superfamily that has recently gained increasing attention as a cancer target for therapeutic intervention. The aim of this work was to perform an alignment-independent three-dimensional quantitative structure-activity relationship (3D QSAR) study for a series of RON inhibitors. A 3D QSAR model based on the GRid-INdependent Descriptors (GRIND) methodology was generated using a set of 19 compounds with RON inhibitory activities. The generated 3D QSAR model revealed the main structural features important for the potency of RON inhibitors. The results of this study can be used in lead optimization projects to design novel compounds where inhibition of RON is needed.
{"title":"An alignment-independent three-dimensional quantitative structure-activity relationship study on ron receptor tyrosine kinase inhibitors.","authors":"Omid Zarei, Stéphane L Raeppel, Maryam Hamzeh-Mivehroud","doi":"10.1142/S0219720022500159","DOIUrl":"https://doi.org/10.1142/S0219720022500159","url":null,"abstract":"<p><p>Recepteur d'Origine Nantais known as RON is a member of the receptor tyrosine kinase (RTK) superfamily which has recently gained increasing attention as cancer target for therapeutic intervention. The aim of this work was to perform an alignment-independent three-dimensional quantitative structure-activity relationship (3D QSAR) study for a series of RON inhibitors. A 3D QSAR model based on GRid-INdependent Descriptors (GRIND) methodology was generated using a set of 19 compounds with RON inhibitory activities. The generated 3D QSAR model revealed the main structural features important in the potency of RON inhibitors. The results obtained from the presented study can be used in lead optimization projects for designing of novel compounds where inhibition of RON is needed.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9233977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}