Pub Date : 2014-12-18 DOI: 10.1109/ISB.2014.6990736
Su-Li Li, Meng-Zhe Jin, Zhao-Hui Qi
Human influenza A virus is an important pathogen that has long threatened human health, and the study of mutations in the HA gene is particularly important. Here we investigate the evolutionary characteristics of the HA gene of H3N2 influenza virus from 1990 to 2013. Numerical mapping and PCA clustering analysis are applied to the gene evolution analysis. A clustering diagram produced in MATLAB maps the HA gene sequences into 2D space. The first two principal components account for 78.48% of the variance. The points cluster into three groups: 1990-1999, 2000-2005 and 2006-2013. However, there is no obvious gap between the groups. We then show graphical representations of the HA gene sequences according to the emergence time of the isolates and their continents of origin. The results show that human influenza A H3N2 virus evolved gradually from 1990 to 2013, with no large-scale genetic recombination. Even so, continued monitoring of human influenza A (H3N2) viruses is necessary.
Title: Evolution analysis for HA gene of human influenza A H3N2 virus (1990 – 2013). Published in: 2014 8th International Conference on Systems Biology (ISB). https://doi.org/10.1109/ISB.2014.6990736
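The numerical mapping and PCA steps named in the HA gene abstract above can be sketched as follows. The abstract does not say which numerical mapping was used, and the authors worked in MATLAB, so everything here is an assumption for illustration: dinucleotide frequencies stand in for the mapping, and `kmer_vector`, `leading_eigenpair` and `pca_2d` are hypothetical helper names. PCA is done by power iteration with deflation so that the sketch needs only the standard library.

```python
import math
import random
from itertools import product

# All 16 dinucleotides in a fixed order; these are the feature names.
KMERS = ["".join(p) for p in product("ACGT", repeat=2)]


def kmer_vector(seq):
    """Map a DNA sequence to its dinucleotide frequencies (assumed mapping)."""
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2].upper()
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
    total = sum(counts.values()) or 1
    return [counts[k] / total for k in KMERS]


def leading_eigenpair(matrix, iterations=1000, seed=0):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix (power iteration)."""
    d = len(matrix)
    rng = random.Random(seed)
    # Random start: a uniform start would be orthogonal to centred frequency data.
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left in any direction
            return 0.0, [0.0] * d
        v = [x / norm for x in w]
        eigenvalue = norm
    return eigenvalue, v


def pca_2d(rows):
    """Project rows onto the first two principal components.

    Returns (coordinates, explained_variance_ratios).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    total_var = sum(cov[j][j] for j in range(d))
    components, ratios = [], []
    for _ in range(2):
        lam, v = leading_eigenpair(cov)
        components.append(v)
        ratios.append(lam / total_var if total_var else 0.0)
        # Deflation: remove the component just found before looking for the next.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    coords = [[sum(c[j] * v[j] for j in range(d)) for v in components]
              for c in centered]
    return coords, ratios
```

With real data, each HA sequence would be passed through `kmer_vector`, the resulting rows through `pca_2d`, and the sum of the two returned ratios would play the role of the 78.48% figure quoted in the abstract.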
Pub Date : 2014-12-18 DOI: 10.1109/ISB.2014.6990740
Hongdong Li, G. Hong, Zheng Guo
Detecting aberrant DNA methylation as a diagnostic or prognostic biomarker for cancer has recently been a topic of considerable interest. However, current classifiers based on absolute methylation values measured in one cohort of samples are typically difficult to transfer to other cohorts. Here, we employed a modified rank-based method to extract pairs of CpG sites whose relative DNA methylation levels in disease samples are reversed compared with those in normal controls, for each of five cancer types. The reversal pairs showed excellent prediction performance, with accuracy above 95% for each cancer type. Furthermore, the reversal pairs identified for a cancer type could distinguish samples of different subtypes and malignant stages of that cancer, including early stages, from normal controls, and were specific to that cancer. In conclusion, the reversal pairs detected by the rank-based method are accurate, transferable to independent cohorts of samples, and applicable to early cancer diagnosis. They could also be used to detect common molecular alterations in cancer, which may inform follow-up studies.
Title: Reversal DNA methylation patterns for cancer diagnosis. Published in: 2014 8th International Conference on Systems Biology (ISB). https://doi.org/10.1109/ISB.2014.6990740
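The idea of reversal pairs in the methylation abstract above can be illustrated with a small sketch. This is not the authors' modified rank-based method, whose details the abstract does not give. It is a plain version that assumes a pair of CpG sites counts as reversed when one ordering holds in nearly all normal samples and the opposite ordering holds in nearly all disease samples. The names `reversal_pairs`, `classify` and the `min_fraction` parameter are hypothetical.

```python
from itertools import combinations


def reversal_pairs(normal, disease, min_fraction=0.95):
    """Find ordered CpG index pairs (a, b) whose within-sample order flips.

    A pair is kept when beta[a] < beta[b] in at least min_fraction of the
    normal samples and beta[a] > beta[b] in at least min_fraction of the
    disease samples. Each sample is a list of methylation values per site.
    """
    n_sites = len(normal[0])
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        for a, b in ((i, j), (j, i)):  # test both orientations
            normal_frac = sum(s[a] < s[b] for s in normal) / len(normal)
            disease_frac = sum(s[a] > s[b] for s in disease) / len(disease)
            if normal_frac >= min_fraction and disease_frac >= min_fraction:
                pairs.append((a, b))
    return pairs


def classify(sample, pairs):
    """Majority vote: 'disease' when more than half the pairs show the reversed order."""
    votes = sum(sample[a] > sample[b] for a, b in pairs)
    return "disease" if votes * 2 > len(pairs) else "normal"
```

Because the rule only compares values within one sample, it does not depend on the absolute scale of a cohort's measurements, which is the transferability argument made in the abstract.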
With the development of high-throughput, low-cost sequencing technology, a large number of marine microbial sequences have been generated, making it possible to study more uncultivated marine microbes. In microbial communities, functional capability and taxonomic structure are generally closely related to environmental factors, and these relationships are hidden in the large volume of sequence data. However, most studies have used canonical correlation analysis (CCA) to investigate the correlations among taxa, pathways and environmental factors, and with CCA it is difficult to determine which environmental factors are the major determinants of particular taxa and pathways. In this paper, we integrated 14 ocean metagenomes with geographical, meteorological and geophysicochemical data to construct weighted correlation networks based on Spearman correlation. Using an improved weighted network community detection algorithm, named IWNCD, we found some special correlation patterns among taxa, pathways and environmental factors. Analysis of these patterns shows that climatic factors such as temperature, sunlight and correlated CO2, and nutrients such as chlorophyll and primary production, are the main determinants of functional community composition. The growth and development of some particular taxa depend on a few main environmental factors such as sunlight, temperature, CO2, primary production, dissolved oxygen and dissolved silicate. In addition, sampling sites that are more similar in geographic location tend to be closer together in terms of their metabolic pathways.
Title: Mining correlation patterns of taxa, pathways and environmental factors with an improved weighted network community detection algorithm. Authors: Xiao-Ying Yan, Shaowu Zhang, Ze-Gang Wei, Wei-feng Guo. Published in: 2014 8th International Conference on Systems Biology (ISB). Pub Date : 2014-10-01. https://doi.org/10.1109/ISB.2014.6990746
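The network construction step in the metagenome abstract above, a weighted network built from Spearman correlations, can be sketched as follows. The sketch covers only that step and does not attempt the IWNCD community detection algorithm, which the abstract names but does not describe. The `threshold` cut-off, the function names and the feature names used in any call are assumptions for illustration.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_network(features, threshold=0.8):
    """Build weighted edges between features whose |rho| reaches the threshold.

    features maps a name (taxon, pathway or environmental factor) to its
    values across sampling sites. Returns {(name_a, name_b): rho}.
    """
    names = sorted(features)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = spearman(features[a], features[b])
            if abs(rho) >= threshold:
                edges[(a, b)] = rho
    return edges
```

The signed `rho` values are kept as edge weights so that a later community detection step can distinguish positive from negative associations if it needs to.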
Pub Date : 2014-10-01 DOI: 10.1109/ISB.2014.6990748
Changyi Ma, J. Shang, Shengjun Li, Y. Sun
Most complex diseases are believed to be caused mainly by epistatic interactions between pairs of single nucleotide polymorphisms (SNPs), namely SNP-SNP interactions. Although much work has been done on detecting SNP-SNP interactions, algorithmic development is still ongoing because of the mathematical and computational complexity of the problem. In this study, we propose PSOMiner, a method based on the generalized particle swarm optimization algorithm with mutual information as its fitness function, for detecting the SNP-SNP interaction with the highest pathogenic effect in a SNP data set. PSOMiner is evaluated on six simulated data sets using detection power as the criterion. The results demonstrate that PSOMiner is promising for the detection of SNP-SNP interactions. In addition, applying PSOMiner to a real age-related macular degeneration (AMD) data set provides several new clues for exploring AMD-associated SNPs that have not been described previously. PSOMiner might be an alternative to existing methods for detecting SNP-SNP interactions.
Title: Detection of SNP-SNP interaction based on the generalized particle swarm optimization algorithm. Published in: 2014 8th International Conference on Systems Biology (ISB). https://doi.org/10.1109/ISB.2014.6990748
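The fitness function named in the PSOMiner abstract above, mutual information between a SNP pair and the phenotype, can be sketched as follows. This is not PSOMiner itself: the particle swarm search is replaced here by an exhaustive scan over all pairs, which is only feasible for tiny data sets, and the function names and the genotype encoding are assumptions for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(labels):
    """Shannon entropy (in bits) of a list of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(genotypes, phenotype, snp_pair):
    """I(S; Y) = H(S) + H(Y) - H(S, Y), with S the joint genotype of the pair.

    genotypes is a list of per-individual genotype tuples, phenotype a list
    of case/control labels in the same order.
    """
    joint_genotype = [tuple(row[s] for s in snp_pair) for row in genotypes]
    combined = list(zip(joint_genotype, phenotype))
    return entropy(joint_genotype) + entropy(phenotype) - entropy(combined)


def best_pair(genotypes, phenotype):
    """Exhaustive stand-in for the swarm search: the pair with the highest fitness."""
    n_snps = len(genotypes[0])
    return max(combinations(range(n_snps), 2),
               key=lambda pair: mutual_information(genotypes, phenotype, pair))
```

In a swarm-based search, each particle would encode a candidate SNP pair and `mutual_information` would score it, so that only a fraction of all pairs needs to be evaluated.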
Pub Date : 2014-10-01 DOI: 10.1109/ISB.2014.6990759
Yasen Jiao, Pufeng Du, Xiaoquan Su
Knowing the subcellular location of a protein is an important step in understanding its biological functions. In this paper, we developed a new method to identify whether or not a protein in plant cells is a Golgi-resident protein. We propose incorporating transmembrane domain information and six different kinds of physicochemical properties of amino acids into the general form of Chou's pseudo-amino acid composition. Using SVM-based classifiers, our method achieved over 90% prediction accuracy in 5-fold cross-validation, which is much better than other state-of-the-art methods.
Title: Predicting Golgi-resident proteins in plants by incorporating N-terminal transmembrane domain information in the general form of Chou's pseudoamino acid compositions. Published in: 2014 8th International Conference on Systems Biology (ISB). https://doi.org/10.1109/ISB.2014.6990759
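The feature encoding mentioned in the Golgi-resident protein abstract above builds on pseudo-amino acid composition, which can be sketched in its basic form as follows. This is a simplified illustration, not the paper's encoding: it uses a single property (the Kyte-Doolittle hydropathy scale, swapped in here as an assumption) instead of the six properties and the transmembrane domain information the authors describe, and it leaves out the SVM classifier. The function name `pseaac` and the defaults `lam=3` and `weight=0.05` are illustrative choices.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here as the only property (assumption).
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8, "G": -0.4, "H": -3.2,
    "I": 4.5, "K": -3.9, "L": 3.8, "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5,
    "R": -4.5, "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}

# Standardise the property over the 20 amino acids (zero mean, unit variance).
_mean = sum(HYDROPATHY.values()) / 20
_std = math.sqrt(sum((v - _mean) ** 2 for v in HYDROPATHY.values()) / 20)
STANDARDIZED = {aa: (v - _mean) / _std for aa, v in HYDROPATHY.items()}


def pseaac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional pseudo-amino acid composition vector.

    The first 20 entries are weighted amino acid frequencies; entry 20 + k - 1
    carries the sequence-order correlation factor between residues k apart.
    """
    seq = [aa for aa in sequence.upper() if aa in STANDARDIZED]
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    freqs = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(STANDARDIZED[seq[i]] - STANDARDIZED[seq[i + k]]) ** 2
                 for i in range(len(seq) - k)]
        thetas.append(sum(diffs) / len(diffs))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

The resulting vectors have a fixed length regardless of protein length, which is what makes them usable as input to a classifier such as an SVM.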
Pub Date : 2014-10-01 DOI: 10.1109/ISB.2014.6990758
Liwei Wang, Jiabei Wang, Qian Zhu
Drug repositioning is an emerging approach for finding alternative uses of existing drugs efficiently and economically, aided in particular by advances in computational technology. Current progress in computational drug repositioning focuses primarily on developing or improving informatics approaches, or on exploring different types of data, in order to identify possible drug candidates. In contrast to existing studies, we propose a novel method for constructing a disease-based network from data extracted from Semantic MEDLINE. Phenotypic associations (disease-disease associations) can be identified from this network, which can drive drug repositioning studies targeting a specific domain. In this paper, we demonstrate through case studies the capability of the disease-based network to discover hidden phenotypic associations in support of drug repositioning.
Title: Evidence based disease network construction towards drug repositioning. Published in: 2014 8th International Conference on Systems Biology (ISB). https://doi.org/10.1109/ISB.2014.6990758
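One way to picture the disease-based network in the drug repositioning abstract above is the sketch below. The abstract does not describe how the network is built, so the rule used here is purely an assumption: the input is a list of subject, predicate, object triples of the kind Semantic MEDLINE provides, and two diseases are linked when they share at least one associated concept. The function name, the triple layout and all identifiers in any example call are hypothetical placeholders, not real biomedical facts.

```python
from collections import defaultdict
from itertools import combinations


def disease_network(predications, diseases):
    """Link diseases that share at least one associated non-disease concept.

    predications: iterable of (subject, predicate, object) triples.
    diseases: set of concept names to treat as disease nodes.
    Returns {(disease_a, disease_b): sorted list of shared concepts}.
    """
    neighbours = defaultdict(set)
    for subject, _predicate, obj in predications:
        if subject in diseases and obj not in diseases:
            neighbours[subject].add(obj)
        elif obj in diseases and subject not in diseases:
            neighbours[obj].add(subject)
    edges = {}
    for a, b in combinations(sorted(neighbours), 2):
        shared = neighbours[a] & neighbours[b]
        if shared:
            edges[(a, b)] = sorted(shared)  # the evidence behind the link
    return edges
```

Keeping the shared concepts on each edge means a candidate disease-disease association can be traced back to the statements that support it.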
Pub Date : 2014-10-01 DOI: 10.1109/ISB.2014.6990749
Theodoros Koutsandreas, I. Valavanis, E. Pilalis, A. Chatziioannou
This study aims to improve the interpretation of the aging process by exploring a broad gene set derived from the analysis of an integrative transcriptomic microarray dataset. The dataset comprises human skeletal muscle samples obtained from healthy males and females, which were used to derive a gene signature with high informational content with respect to its functional association with the aging phenotype. To this end, we applied a multilayered computational workflow that integrates advanced statistical methodologies for deriving reliable confidence measures, distribution-based entropy calculations for examining the informational content of the dataset, enrichment analysis, graph-theoretic methods and intuitive visualization. Specifically, statistical testing revealed differentially expressed genes, while an uncertainty calculation algorithm exploiting Gene Ontology (GO) term annotations extended the list of significant genes from 254 to 2791; that is, the p-value threshold was raised from 0.0005 to 0.103 while noise measurements were kept acceptably low. This rich gene set functionally associated the macroscopic phenotype of muscular aging with highly informative, stably intercorrelated molecular annotations in the GO database. Finally, a set of 57 reliable genes comprising a gender-independent aging signature was identified after incorporating crucial information about the genes' pivotal regulatory roles, as inferred from the GO tree. The biological interpretation was greatly aided by illustrating the functional mappings between genes, cellular locations and biological processes through circle packing graphs.
Title: An Entropy-based Statistical Workflow Provides Noise-Minimizing Biological Annotation for Muscular Aging. Published in: 2014 8th International Conference on Systems Biology (ISB). https://doi.org/10.1109/ISB.2014.6990749