Pub Date: 2016-01-01; Epub Date: 2016-05-24; DOI: 10.1155/2016/5614058
Rongying Tang, Debra O Prosser, Donald R Love
The increasing diagnostic use of gene sequencing has led to an expanding dataset of novel variants that lie within consensus splice junctions. The challenge for diagnostic laboratories is to evaluate these variants in order to determine whether they affect splicing or are merely benign. A common evaluation strategy is in silico analysis, for which a number of programmes are available online; however, there are currently no consensus guidelines on the selection of programmes or on protocols for interpreting the prediction results. Using a collection of 222 pathogenic mutations and 50 benign polymorphisms, we evaluated the sensitivity and specificity of four in silico programmes in predicting the effect of each variant on splicing. The programmes comprised Human Splice Finder (HSF), Max Entropy Scan (MES), NNSplice, and ASSP. The MES and ASSP programmes gave the highest performance based on receiver operating characteristic (ROC) curve analysis, with an optimal cut-off of a 10% reduction in score. The study also showed that the sensitivity of prediction is affected by the level of conservation of individual positions, with in silico predictions for variants at positions -4 and +7 within consensus splice sites being largely uninformative.
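The 10% score-reduction cut-off can be illustrated with a short sketch. It assumes that a programme yields one splice-site score for the wild-type sequence and one for the variant sequence, that a larger score means a stronger site, and that a variant is called splice-affecting when the relative drop reaches the cut-off. The function names and example scores are hypothetical and are not taken from HSF, MES, NNSplice, or ASSP.

```python
def percent_reduction(wild_type_score, variant_score):
    """Relative drop of the variant score versus the wild-type score, in percent."""
    if wild_type_score <= 0:
        raise ValueError("wild-type score must be positive for a relative comparison")
    return 100.0 * (wild_type_score - variant_score) / wild_type_score


def predicts_splice_defect(wild_type_score, variant_score, cutoff_percent=10.0):
    """Call a variant splice-affecting when the score drops by at least the cut-off.

    Assumption: the comparison is inclusive (>=); the abstract only gives the 10% value.
    """
    return percent_reduction(wild_type_score, variant_score) >= cutoff_percent


def sensitivity_specificity(calls, truth):
    """Return (sensitivity, specificity) for boolean calls against known labels."""
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    fn = sum(1 for c, t in zip(calls, truth) if not c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    fp = sum(1 for c, t in zip(calls, truth) if c and not t)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical (wild-type score, variant score, known pathogenic?) triples.
examples = [(10.0, 8.5, True), (9.0, 8.8, True), (8.0, 7.9, False), (7.0, 5.0, False)]
calls = [predicts_splice_defect(wt, var) for wt, var, _ in examples]
truth = [label for _, _, label in examples]
print(calls, sensitivity_specificity(calls, truth))
```

Sweeping `cutoff_percent` over a range of values and recording the resulting sensitivity and specificity pairs gives the points of a ROC curve, which is one way a cut-off such as 10% could be chosen.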
Title: Evaluation of Bioinformatic Programmes for the Analysis of Variants within Splice Site Consensus Regions. Journal: Advances in Bioinformatics, volume 2016, article 5614058. DOI URL: https://doi.org/10.1155/2016/5614058
Pub Date: 2016-01-01; Epub Date: 2016-06-20; DOI: 10.1155/2016/7357123
Juan Frausto-Solis, Ernesto Liñán-García, Juan Paulo Sánchez-Hernández, J Javier González-Barbosa, Carlos González-Flores, Guadalupe Castilla-Valdez
A new hybrid Multiphase Simulated Annealing Algorithm using Boltzmann and Bose-Einstein distributions (MPSABBE) is proposed. MPSABBE was designed for solving instances of the Protein Folding Problem (PFP). This new approach has four phases: (i) Multiquenching Phase (MQP), (ii) Boltzmann Annealing Phase (BAP), (iii) Bose-Einstein Annealing Phase (BEAP), and (iv) Dynamical Equilibrium Phase (DEP). BAP and BEAP are simulated annealing search procedures based on the Boltzmann and Bose-Einstein distributions, respectively. DEP is also a simulated annealing search procedure; it is applied at the final temperature of the fourth phase and can be seen as a second Bose-Einstein phase. MQP is a search process that ranges from extremely high to high temperatures, applies a very fast cooling process, and is not very restrictive in accepting new solutions. In contrast, BAP and BEAP range from high to low and from low to very low temperatures, respectively, and are more restrictive in accepting new solutions. DEP uses a particular heuristic to detect stochastic equilibrium by applying a least squares method during its execution. MPSABBE parameters are tuned with an analytical method that considers the maximal and minimal deterioration of problem instances. MPSABBE was tested with several instances of PFP, showing that using both distributions is better than using only the Boltzmann distribution as in classical SA.
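To make the difference between the two annealing phases concrete, the sketch below compares two acceptance rules for a candidate solution that worsens the energy by `delta` at a given temperature. The Boltzmann rule is the standard Metropolis criterion. The Bose-Einstein-shaped rule is only an assumption for illustration, since the abstract does not state the acceptance formula used by MPSABBE; the function names are likewise hypothetical.

```python
import math


def boltzmann_acceptance(delta, temperature):
    """Metropolis rule: always accept improvements, else accept with exp(-delta/T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)


def bose_einstein_acceptance(delta, temperature):
    """Assumed Bose-Einstein-shaped rule: 1 / (exp(delta/T) - 1), capped at 1.

    Illustration only; this is not confirmed to be the formula used in the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta <= 0:
        return 1.0
    x = delta / temperature
    if x > 700.0:  # exp(x) would overflow a float; the probability is effectively 0
        return 0.0
    return min(1.0, 1.0 / math.expm1(x))


# Compare the two rules for the same deterioration as the temperature falls.
for temperature in (10.0, 1.0, 0.1):
    print(temperature,
          boltzmann_acceptance(1.0, temperature),
          bose_einstein_acceptance(1.0, temperature))
```

Under this assumed form the two rules nearly coincide when `delta / temperature` is large and differ most when it is small, so the choice of distribution matters chiefly for small deteriorations relative to the current temperature.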
Title: Multiphase Simulated Annealing Based on Boltzmann and Bose-Einstein Distribution Applied to Protein Folding Problem. Journal: Advances in Bioinformatics, volume 2016, article 7357123. DOI URL: https://doi.org/10.1155/2016/7357123
Pub Date: 2016-01-01; Epub Date: 2016-04-14; DOI: 10.1155/2016/1673284
Rayapadi G Swetha, Sudha Ramaiah, Anand Anbarasu, Kanagaraj Sekar
Ebola Virus Disease (EVD) is a life-threatening haemorrhagic fever in humans. Even though there are many reports on EVD, the protein precursor functions and virulence factors of ebolaviruses remain poorly understood. Comparative analyses of Ebolavirus genomes will help in the identification of these important features. This prompted us to develop the Ebolavirus Database (EDB), which provides links to various tools that will help researchers locate important regions in both the genomes and proteomes of Ebolavirus. The genomic analyses of ebolaviruses will provide important clues for locating the essential and core functional genes. The aim of EDB is to act as an integrated resource for ebolaviruses, and we strongly believe that the database will be a useful tool for clinicians, microbiologists, health care workers, and bioscience researchers.
Title: Ebolavirus Database: Gene and Protein Information Resource for Ebolaviruses. Journal: Advances in Bioinformatics, volume 2016, article 1673284. DOI URL: https://doi.org/10.1155/2016/1673284
Pub Date: 2016-01-01; Epub Date: 2016-05-04; DOI: 10.1155/2016/1276594
Salvador Eugenio C Caoili
Epitope-based design of vaccines, immunotherapeutics, and immunodiagnostics is complicated by structural changes that radically alter immunological outcomes. This is obscured by expressing redundancy among linear-epitope data as fractional sequence-alignment identity, which fails to account for potentially drastic loss of binding affinity due to single-residue substitutions even where these might be considered conservative in the context of classical sequence analysis. From the perspective of immune function based on molecular recognition of epitopes, functional redundancy of epitope data (FRED) thus may be defined in a biologically more meaningful way based on residue-level physicochemical similarity in the context of antigenic cross-reaction, with functional similarity between epitopes expressed as the Shannon information entropy for differential epitope binding. Such similarity may be estimated in terms of structural differences between an immunogen epitope and an antigen epitope with reference to an idealized binding site of high complementarity to the immunogen epitope, by analogy between protein folding and ligand-receptor binding; but this underestimates potential for cross-reactivity, suggesting that epitope-binding site complementarity is typically suboptimal as regards immunologic specificity. The apparently suboptimal complementarity may reflect a tradeoff to attain optimal immune function that favors generation of immune-system components each having potential for cross-reactivity with a variety of epitopes.
Title: Expressing Redundancy among Linear-Epitope Sequence Data Based on Residue-Level Physicochemical Similarity in the Context of Antigenic Cross-Reaction. Journal: Advances in Bioinformatics, volume 2016, article 1276594. DOI URL: https://doi.org/10.1155/2016/1276594
Pub Date: 2016-01-01; Epub Date: 2016-03-02; DOI: 10.1155/2016/9654921
Luke Day, Ouala Abdelhadi Ep Souki, Andreas A Albrecht, Kathleen Steinhöfel
Identifying sets of metastable conformations is a major research topic in RNA energy landscape analysis, and recently several methods have been proposed for finding local minima in landscapes spawned by RNA secondary structures. An important and time-critical component of such methods is steepest, or gradient, descent in attraction basins of local minima. We analyse the speed-up achievable by randomised descent in attraction basins in the context of large sample sets whose size is of the order of ~10^6. While the gain for each individual sample might be marginal, the overall run-time improvement can be significant. Moreover, for the two nongradient methods we analysed on partial energy landscapes induced by ten different RNA sequences, we found that the number of observed local minima is on average larger by 7.3% and 3.5%, respectively. The run-time improvement is approximately 16.6% and 6.8% on average over the ten partial energy landscapes. For the large sample size we selected for descent procedures, the coverage of local minima is very high up to the energy values of the region where the samples were randomly selected from the partial energy landscapes; that is, the difference from the total set of local minima is mainly due to the upper area of the energy landscapes.
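The contrast between steepest descent and randomised descent can be sketched on a toy landscape. The example assumes a one-dimensional list of energies in which each index has at most two neighbours; real RNA secondary-structure landscapes have far larger neighbourhoods, which is where skipping a full neighbourhood scan could save time. Interpreting "randomised descent" as moving to the first lower-energy neighbour found in random order is an assumption, as the abstract does not define the two nongradient methods, and all names here are hypothetical.

```python
import random


def neighbours(index, size):
    """Indices adjacent to `index` in a one-dimensional toy landscape."""
    return [j for j in (index - 1, index + 1) if 0 <= j < size]


def steepest_descent(energy, start):
    """Scan every neighbour and move to the lowest one until none is lower."""
    current = start
    while True:
        candidates = neighbours(current, len(energy))
        if not candidates:
            return current
        best = min(candidates, key=lambda j: energy[j])
        if energy[best] >= energy[current]:
            return current  # no neighbour is lower: local minimum reached
        current = best


def random_descent(energy, start, rng):
    """Move to the first lower neighbour found in random order (assumed variant)."""
    current = start
    while True:
        candidates = neighbours(current, len(energy))
        rng.shuffle(candidates)
        for j in candidates:
            if energy[j] < energy[current]:
                current = j
                break  # accept immediately without scanning the rest
        else:
            return current  # no neighbour is lower: local minimum reached


# Hypothetical energies with local minima at indices 1 and 5.
energy = [3, 1, 2, 5, 4, 0, 6]
print(steepest_descent(energy, 3), random_descent(energy, 3, random.Random(42)))
```

Both procedures stop only at a local minimum, but the randomised variant may end in a different basin than steepest descent from the same start, which is consistent with the abstract's observation that the nongradient methods found somewhat more local minima.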
Title: Random versus Deterministic Descent in RNA Energy Landscape Analysis. Journal: Advances in Bioinformatics, volume 2016, article 9654921. DOI URL: https://doi.org/10.1155/2016/9654921
Pub Date: 2016-01-01; Epub Date: 2016-06-02; DOI: 10.1155/2016/8054219
Manish Kurhekar, Umesh Deshpande
Modeling of stem cells not only describes but also predicts how a stem cell's environment can control its fate. The first stem cell populations discovered were hematopoietic stem cells (HSCs). In this paper, we present a deterministic model of bone marrow (which hosts HSCs) that is consistent with several qualitative biological observations. The model incorporates stem cell death (apoptosis) after a certain number of cell divisions and demonstrates that a single HSC can potentially populate the entire bone marrow. It also demonstrates that a sufficient number of differentiated cells (RBCs, WBCs, etc.) is produced. We prove that our model of bone marrow is biologically consistent and that it overcomes the biological feasibility limitations of previously reported models. The major contribution of our model is the flexibility it allows in choosing model parameters, which permits several different simulations to be carried out in silico without affecting the homeostatic properties of the model. We have also performed an agent-based simulation of the proposed bone marrow model and report the parameter details and the results obtained. The program for the agent-based simulation is made available on a publicly accessible website.
Title: Agent-Based Deterministic Modeling of the Bone Marrow Homeostasis. Journal: Advances in Bioinformatics, volume 2016, article 8054219. DOI URL: https://doi.org/10.1155/2016/8054219
Pub Date: 2016-01-01; Epub Date: 2016-06-06; DOI: 10.1155/2016/6040124
Sebastián Metz, Juan Manuel Cabrera, Eva Rueda, Federico Giri, Patricia Amavet
Microsatellites are genomic sequences composed of tandem repeats of short nucleotide motifs and are widely used as molecular markers in population genetics. FullSSR is a new bioinformatic tool for microsatellite (SSR) loci detection and primer design using genomic data from NGS assays. The software was tested with 2000 sequences from the Oryza sativa shotgun sequencing project in the National Center for Biotechnology Information Trace Archive and with partial genome sequences obtained with ROCHE 454® from Caiman latirostris, Salvator merianae, Aegla platensis, and Zilchiopsis collastinensis. FullSSR performance was compared against other similar SSR search programs. The results of this kind of approach depend on the parameters set by the user. In addition, results can be affected by the analyzed sequences because of differences among the genomes. FullSSR simplifies the detection of SSRs and primer design on a large data set. The command line interface of FullSSR was intended to be used as part of a genomic analysis pipeline; however, it can also be used as a stand-alone program because the results are easily interpreted by a nonexpert user.
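The detection step can be illustrated with a minimal regular-expression sketch. It is not FullSSR's algorithm, which the abstract does not describe; it only shows how tandem repeats of a short motif can be located and how user-set parameters change the result. The function name, the default motif lengths, and the minimum repeat count are assumptions, and primer design is not covered.

```python
import re


def find_ssrs(sequence, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    Assumes min_repeats >= 2 and a sequence made of A, C, G, and T only.
    """
    # A motif of min_motif..max_motif bases (shortest first), then the same
    # motif repeated at least (min_repeats - 1) more times via a backreference.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical read containing a dinucleotide repeat.
print(find_ssrs("GGTACACACACACTTG"))
```

Raising `min_repeats` or narrowing the motif length range reduces the number of reported loci, which illustrates why the abstract notes that results depend on the parameters chosen by the user.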
Title: FullSSR: Microsatellite Finder and Primer Designer. Journal: Advances in Bioinformatics, volume 2016, article 6040124. DOI URL: https://doi.org/10.1155/2016/6040124
Pub Date: 2016-01-01; Epub Date: 2016-05-17; DOI: 10.1155/2016/3791214
Manuel Galli, Italo Zoppis, Gabriele De Sio, Clizia Chinello, Fabio Pagni, Fulvio Magni, Giancarlo Mauri
Biomarkers able to characterise and predict multifactorial diseases are still one of the most important targets of all "omics" investigations. In this context, Matrix-Assisted Laser Desorption/Ionisation-Mass Spectrometry Imaging (MALDI-MSI) has gained considerable attention in recent years, but it has also led to a huge amount of complex data to be processed and interpreted. For this reason, computational and machine learning procedures for biomarker discovery are important tools to consider, both to reduce data dimension and to provide predictive markers for specific diseases. For instance, the availability of protein and genetic markers to support thyroid lesion diagnoses would have a deep impact on society because of the high number of undetermined reports (THY3), which are generally treated as malignant cases. In this paper we show how an accurate classification of thyroid bioptic specimens can be obtained by applying a state-of-the-art machine learning approach (i.e., Support Vector Machines) to MALDI-MSI data, together with a particular wrapper feature selection algorithm (i.e., recursive feature elimination). The model provides accurate discrimination using only 20 out of 144 features, resulting in an increase in model performance, reliability, and computational efficiency. Finally, tissue areas rather than average proteomic profiles are classified, highlighting potential discriminating areas of clinical interest.
Title: A Support Vector Machine Classification of Thyroid Bioptic Specimens Using MALDI-MSI Data. Journal: Advances in Bioinformatics, volume 2016, article 3791214. DOI URL: https://doi.org/10.1155/2016/3791214
Pub Date: 2016-01-01; Epub Date: 2016-07-10; DOI: 10.1155/2016/2632917
Mohamed M Hassan, Shaza E Omer, Rahma M Khalf-Allah, Razaz Y Mustafa, Isra S Ali, Sofia B Mohamed
This study examined simple variations (SNPs/indels) in the coding and non-coding regions of the Homo sapiens BRAF gene. Variant data were obtained from the SNP database as of its last update in November 2015. Several bioinformatics tools were used to identify SNPs and indels with functional effects on protein function, structure, and expression. For coding polymorphisms, 111 SNPs were predicted to be highly damaging and six others less damaging. In the UTRs, five SNPs and one indel altered microRNA binding sites (3' UTR), whereas no SNP or indel had a functional effect on transcription factor binding sites (5' UTR). For the 5'/3' splice sites, the analysis showed that one SNP within a 5' splice site and one indel in a 3' splice site could potentially alter splicing. In conclusion, the functional SNPs and indels identified here could lead to gene alteration, which may contribute directly or indirectly to the occurrence of many diseases.
Title: Bioinformatics Approach for Prediction of Functional Coding/Noncoding Simple Polymorphisms (SNPs/Indels) in Human BRAF Gene. Journal: Advances in Bioinformatics, volume 2016, article 2632917. DOI URL: https://doi.org/10.1155/2016/2632917
Pub Date : 2016-01-01 Epub Date: 2016-03-06 DOI: 10.1155/2016/8792814
Vishwambhar Bhandare, Amutha Ramaswamy
The human Argonaute2 protein (Ago2) is a key player in the RNA interference pathway, and recognition of small RNA by Ago2 is the crucial step in the siRNA-mediated gene silencing mechanism. The present study describes, for the first time, the structural and functional dynamics of human Ago2 and the mechanism of its interaction with a set of seven siRNAs. During a 25 ns simulation, human Ago2 adopts two conformations, "open" and "close". The PAZ domain, which is responsible for anchoring the 3'-end of the siRNA guide strand, is observed to be a highly flexible region. The interaction between Ago2 and siRNA was analyzed using a set of siRNAs (targeting positions 128, 251, 341, 383, 537, 1113, and 1115 of the mRNA) designed to target tdp43 mutants that cause Amyotrophic Lateral Sclerosis (ALS); the analysis revealed stable and strong recognition of siRNA by the Ago2 protein during dynamics. Among the siRNAs studied, siRNA341 is identified as a potent siRNA for recognition by Ago2 and could therefore be pursued as a possible siRNA candidate to target the mutant tdp43 protein in the treatment of ALS patients.
{"title":"Structural Dynamics of Human Argonaute2 and Its Interaction with siRNAs Designed to Target Mutant tdp43.","authors":"Vishwambhar Bhandare, Amutha Ramaswamy","doi":"10.1155/2016/8792814","DOIUrl":"https://doi.org/10.1155/2016/8792814","url":null,"PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2016 ","pages":"8792814"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2016/8792814","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34330443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}