Pub Date: 2015-02-10 DOI: 10.4172/2153-0602.1000168
B. Hajieghrari, N. Farrokhi, B. Goliaei, K. Kavousi
MicroRNAs (miRNAs) are single-stranded, non-coding, endogenous small RNAs of about 22 nucleotides that are directly involved in regulating gene expression at the post-transcriptional level. miRNAs play key roles in development and in responses to biotic and abiotic stresses. Because miRNAs are relatively highly conserved across plant species, homology searches allow new miRNAs to be identified. Here, miRNAs were identified for Amborella trichopoda. Known and unique plant miRNAs from miRBase were BLAST-searched against Expressed Sequence Tags (ESTs) and Genomic Survey Sequences (GSSs) of A. trichopoda. All candidate sequences with an appropriate fold-back structure were screened against a series of miRNA filtering criteria. Finally, we identified and analysed the conservation of 5 potential conserved miRNAs belonging to 5 miRNA gene families from ESTs, as well as 82 newly identified miRNAs belonging to 39 miRNA families from GSSs. Potential target genes of the identified miRNAs were predicted from their sequence complementarity to the respective miRNAs using psRNATarget against the scaffold assignment of A. trichopoda genome sequences. In total, 1219 target sites were identified in the A. trichopoda genome. Of these, 941 (77.19%) were predicted to be subject to miRNA cleavage and 278 (22.81%) were predicted to be regulated via translational repression of mRNA. Of the predicted miRNAs, 18 had no target sequence in A. trichopoda.
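The split of predicted target sites reported in the abstract can be checked with a few lines of arithmetic. The sketch below only reuses the three counts stated above (941, 278 and 1219); the helper name `share_percent` is a hypothetical label chosen for illustration and is not taken from the paper.

```python
def share_percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


# Counts as stated in the abstract.
cleavage_sites = 941      # predicted miRNA cleavage
repression_sites = 278    # predicted translational repression
total_sites = cleavage_sites + repression_sites

print(total_sites)                                   # 1219
print(share_percent(cleavage_sites, total_sites))    # 77.19
print(share_percent(repression_sites, total_sites))  # 22.81
```

The two categories sum to the reported total, and the rounded shares match the percentages quoted in the abstract.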
{"title":"Computational Identification, Characterization and Analysis of Conserved miRNAs and their Targets in Amborella Trichopoda","authors":"B. Hajieghrari, N. Farrokhi, B. Goliaei, K. Kavousi","doi":"10.4172/2153-0602.1000168","DOIUrl":"https://doi.org/10.4172/2153-0602.1000168","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"16 1","pages":"1-10"},"PeriodicalIF":0.0,"publicationDate":"2015-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87565652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-02-07 DOI: 10.4172/2153-0602.1000166
M. Achou, Wahida Loucif-Ayad, H. Legout, Hayan Hmidan, Mohamed Alburaki, L. Garnery
This study assessed the genetic diversity of honeybees (Apis mellifera) in Algeria, North Africa, using the mtDNA COI-COII (Cytochrome Oxidase I and II) molecular marker. In total, 582 honeybee workers were sampled from 22 regions of the country. A PCR-RFLP (Polymerase Chain Reaction Restriction Fragment Length Polymorphism) analysis of the mtDNA samples distinguished the honeybee evolutionary lineages and mtDNA haplotypes from each region. Our data revealed three different honeybee lineages among the studied populations: the African (A), North Mediterranean (C) and West Mediterranean (M) lineages. Eight different mtDNA haplotypes were recorded at various frequencies (A1, A2, A8, A9, A10, A13, C7 and M4). For the first time, our results identified a low genetic introgression (3.1%) of non-local mtDNA haplotypes (C7 and M4) among the local Algerian honeybees, most likely due to the import of foreign honeybees. Notably, the southern Algerian honeybee populations had lower haplotype diversity than the northern populations. Overall, the local North African honeybee subspecies A. m. intermissa and/or A. m. sahariensis seem to be remarkably dominant across northern Algeria.
{"title":"An Insightful Molecular Analysis Reveals Foreign Honeybees Among Algerian Honeybee Populations (Apis mellifera L.)","authors":"M. Achou, Wahida Loucif-Ayad, H. Legout, Hayan Hmidan, Mohamed Alburaki, L. Garnery","doi":"10.4172/2153-0602.1000166","DOIUrl":"https://doi.org/10.4172/2153-0602.1000166","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"108 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2015-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73417474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-02-07 DOI: 10.4172/2153-0602.1000167
C. Reynès, Leslie Regad, R. Sabatier, A. Camproux
The prediction of particular structural motifs associated with biological functions or with structure is of utmost importance. Given the increasing availability of primary sequences without any structure information, predictions from amino-acid (AA) sequences are essential. The proposed method for predicting structural motifs is a two-step approach based on a structural alphabet. This alphabet allows any 3D structure to be encoded as a 1D sequence of structural letters (SL). First, basic correspondence rules between AA and SL are learnt through genetic programming. Then, a Hidden Markov Model is learnt for each previously identified motif of interest. Finally, for any given amino-acid sequence, the method provides the probability that it corresponds to a given 3D motif. The method is applied to ATP binding sites to compare its efficiency with that of other methods on a classical function. Then, the method's ability to learn motifs corresponding to more rarely predicted functions or to other types of motifs is illustrated.
{"title":"Prediction of Structural Patterns of Interest from Protein Primary Sequence through Structural Alphabet: Illustration to ATP/GTP Binding Site Prediction","authors":"C. Reynès, Leslie Regad, R. Sabatier, A. Camproux","doi":"10.4172/2153-0602.1000167","DOIUrl":"https://doi.org/10.4172/2153-0602.1000167","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"20 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2015-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78567455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-02-07 DOI: 10.4172/2153-0602.1000165
L. Ross, L. Frances, Cen Yu-wen, Xu Yang, W. Jian
Objective: To explore the pattern by which Aining granule administration slows CD4 T cell depletion. Method: Data from prospective, randomized, placebo-controlled, double-blinded clinical trials enrolling one hundred HIV/AIDS individuals were re-analyzed to observe the course of different CD T cells over the treatment period. The participants were randomized into two groups: one of 50 cases received Aining granule plus the combination of d4T, ddI and NVP, and the other received placebo plus the combination of d4T, ddI and NVP. Both groups were observed for 11 months. Results: Only patients in the Aining granule treatment group (7, vs. 0 in the control group, a deviation exceeding 2 sigma) had a stable CD4 T cell count over the treatment course. Conclusion: Our results provide insights for molecular investigation of the relation between the active ingredients of Aining granule, DNA replication and HIV-induced CD4 T cell death.
{"title":"Aining Granule Stabilizes the Decline of CD4 Cell Count in HAART-Receiving HIV/AIDS Patients Having Virologic Failure","authors":"L. Ross, L. Frances, Cen Yu-wen, Xu Yang, W. Jian","doi":"10.4172/2153-0602.1000165","DOIUrl":"https://doi.org/10.4172/2153-0602.1000165","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"284 1","pages":"1-4"},"PeriodicalIF":0.0,"publicationDate":"2015-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85462811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-01-02 DOI: 10.4172/2153-0602.1000E117
Kejue Jia, R. Jernigan
With the development of high-throughput, next-generation sequencing and other advanced technologies, a large number of gene expression profiles have been produced. Many of these profiles are available from public databases [1-3]. A challenging research problem that has drawn a lot of attention in the past is to infer gene regulatory networks from the expression data. A gene regulatory network is represented by a directed graph, in which nodes represent transcription factors or mRNA with edges showing transcriptional regulatory relationships between two nodes.
{"title":"Combining Disparate Data Types: Protein Sequences and Protein Structures","authors":"Kejue Jia, R. Jernigan","doi":"10.4172/2153-0602.1000E117","DOIUrl":"https://doi.org/10.4172/2153-0602.1000E117","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"10 3 1","pages":"1-2"},"PeriodicalIF":0.0,"publicationDate":"2015-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73152746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-01-01 Epub Date: 2015-06-06 DOI: 10.4172/2153-0602.1000e119
Zhihua Jiang
During the last ten years, next generation sequencing methods, technologies and platforms have revolutionized the genomics and transcriptomics research fields and advanced their applications in agriculture and biomedicine [1–3]. To date, five major platforms are on the market: the Roche 454 GS FLX(+) system; the Applied Biosystems SOLiD (supported oligonucleotide ligation and detection) and Ion Proton/PGM/Chef systems, now owned by Life Technologies (Grand Island, NY); Solexa GA (Genome Analyzer)/HiSeq/MiSeq/NextSeq, developed by Illumina (San Diego, CA); and the PacBio RSII system, made by Pacific Biosciences (Menlo Park, CA). They differ in sequencing chemistry (e.g., sequencing by ligation vs. sequencing by synthesis); templates (e.g., single molecules vs. clusters amplified by emulsion or bridge PCR); product sizes (e.g., from 75 bp to 8,500 bp in length); and number of reads per run (e.g., from one million to 5,000 million) [2,3].
{"title":"Mining Next Generation Sequencing Data: How to Avoid \"Treasure in, Error Out\".","authors":"Zhihua Jiang","doi":"10.4172/2153-0602.1000e119","DOIUrl":"https://doi.org/10.4172/2153-0602.1000e119","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"6 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4172/2153-0602.1000e119","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35967581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-01-01 DOI: 10.4172/2153-0602.1000163
A Yazdani, E Boerwinkle
Causal analysis and causal inference are a growing area of biostatistics. In parallel, there is increasing focus on using genomic information to guide medical practice, i.e. personalized medicine or decision medicine. This perspective discusses causal inference in the context of personalized or decision medicine, including its assumptions and the concept that the task differs depending on whether the primary goal is the average response to treatment in the population or the ability to characterize the response of an individual or a subgroup. This perspective provides a tutorial on modern causal inference and then suggests how applying specific kinds of causal inference would promote advances in translational sciences. The concept of the subpopulation causal effect is one path toward improved decision medicine. A dataset containing cardiovascular disease risk factor levels and genomic information is analyzed, and different causal effects are estimated.
{"title":"Causal Inference in the Age of Decision Medicine.","authors":"A Yazdani, E Boerwinkle","doi":"10.4172/2153-0602.1000163","DOIUrl":"https://doi.org/10.4172/2153-0602.1000163","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4172/2153-0602.1000163","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33398379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-01-01 DOI: 10.4172/2153-0602.1000164
Shiva Kumar, Vijay H. Ghadage, I. Subramanian, A. Desai, Vivek Singh, A. Jere
Background: The primary objective of life science research is to understand complex cellular mechanisms and the interplay of various genes/proteins in multiple cellular processes. For this, PubMed is still the primary source of biomedical information, even though multiple other databases such as UniProt, Protein Data Bank (PDB) and Reactome exist. Objective: With the large volume of data available from high-throughput technologies and multiple databases, finding relevant gene-process-phenotype information has become extremely challenging and tedious. No tool is currently available to search PubMed and multiple other databases simultaneously to get holistic information. Moreover, a typical PubMed search returns a large number of articles, which must be screened manually to identify the relevant literature. Hence, we developed BioGyan, a literature mining tool that simplifies the combinatorial search for genes, cell types and cellular processes in PubMed and other relevant databases. Methods: BioGyan uses a robust scoring method to rank articles by relevance to the user's search terms. The scoring method is based on the weighted sum of the co-occurrence of gene, process and interaction terms in an abstract. Results: BioGyan retrieves PubMed articles supporting an association between the queried genes and processes, relevant pathways from pathway databases, and 3-dimensional structures from PDB. For easy viewing, all information is presented to the user in a single window. BioGyan showed an accuracy of 85.46% in predicting the relevance of articles to a gene-process association, and performed better than PESCADOR. Conclusion: BioGyan has several key features, such as batch querying of genes and processes, offline reading of articles, export of the article list as a bibliography, and the flexibility for the user to revise article relevance, making it a vital tool for literature search. Thus, BioGyan is a unique tool that offers holistic search across multiple databases while greatly automating the entire process.
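The abstract describes the scoring only as a weighted sum of co-occurrences of gene, process and interaction terms. A minimal sketch of that idea follows. Everything concrete in it is an assumption made for illustration and is not taken from BioGyan: the function names, the sentence-level definition of co-occurrence, and the weight values.

```python
import re


def count_term(text: str, term: str) -> int:
    """Count whole-word, case-insensitive occurrences of term in text."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def relevance_score(abstract, genes, processes, interactions,
                    weights=(1.0, 1.0, 0.5)):
    """Weighted sum of term hits over sentences where a gene and a process co-occur.

    The weights (gene, process, interaction) are illustrative placeholders.
    """
    w_gene, w_process, w_interaction = weights
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", abstract)
    score = 0.0
    for sentence in sentences:
        n_gene = sum(count_term(sentence, t) for t in genes)
        n_process = sum(count_term(sentence, t) for t in processes)
        n_interaction = sum(count_term(sentence, t) for t in interactions)
        # Only sentences mentioning both a gene and a process contribute.
        if n_gene and n_process:
            score += (w_gene * n_gene + w_process * n_process
                      + w_interaction * n_interaction)
    return score


linked = relevance_score("TP53 regulates apoptosis. The weather was nice.",
                         ["TP53"], ["apoptosis"], ["regulates"])
unlinked = relevance_score("TP53 is a gene. Apoptosis is a process.",
                           ["TP53"], ["apoptosis"], ["regulates"])
print(linked, unlinked)  # 2.5 0.0
```

Articles would then be ranked by descending score. Requiring the gene and the process to share a sentence is one possible reading of "co-occurrence"; counting at the level of the whole abstract would be an equally plausible variant.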
{"title":"BioGyan: A Tool to Identify Gene Functions from Literature","authors":"Shiva Kumar, Vijay H. Ghadage, I. Subramanian, A. Desai, Vivek Singh, A. Jere","doi":"10.4172/2153-0602.1000164","DOIUrl":"https://doi.org/10.4172/2153-0602.1000164","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"149 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72877183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2015-01-01 DOI: 10.4172/2153-0602.1000E118
J. Wang
With the development of high-throughput, next-generation sequencing and other advanced technologies, a large number of gene expression profiles have been produced. Many of these profiles are available from public databases [1-3]. A challenging research problem that has drawn a lot of attention in the past is to infer gene regulatory networks from the expression data. A gene regulatory network is represented by a directed graph, in which nodes represent transcription factors or mRNA with edges showing transcriptional regulatory relationships between two nodes.
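The directed-graph representation described above can be sketched as a small adjacency structure. This is only an illustration of the data structure, not an inference method. The class name `RegulatoryNetwork` and the node labels (`TF_A`, `gene_X`, and so on) are hypothetical.

```python
from collections import defaultdict


class RegulatoryNetwork:
    """Directed graph: an edge regulator -> target means 'regulator regulates target'."""

    def __init__(self):
        self._targets = defaultdict(set)  # regulator -> set of regulated nodes
        self._nodes = set()

    def add_regulation(self, regulator: str, target: str) -> None:
        """Add a directed edge from regulator to target."""
        self._targets[regulator].add(target)
        self._nodes.update((regulator, target))

    def nodes(self):
        return sorted(self._nodes)

    def targets_of(self, regulator: str):
        """Nodes directly regulated by the given node (outgoing edges)."""
        return sorted(self._targets.get(regulator, set()))

    def regulators_of(self, target: str):
        """Nodes that directly regulate the given node (incoming edges)."""
        return sorted(r for r, ts in self._targets.items() if target in ts)


network = RegulatoryNetwork()
network.add_regulation("TF_A", "gene_X")
network.add_regulation("TF_A", "gene_Y")
network.add_regulation("TF_B", "gene_X")

print(network.targets_of("TF_A"))       # ['gene_X', 'gene_Y']
print(network.regulators_of("gene_X"))  # ['TF_A', 'TF_B']
```

Because edges are directed, `targets_of("gene_X")` is empty here even though `gene_X` has two regulators. Inferring a network from expression data amounts to deciding which of these edges to add.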
{"title":"Inferring Gene Regulatory Networks: Challenges and Opportunities","authors":"J. Wang","doi":"10.4172/2153-0602.1000E118","DOIUrl":"https://doi.org/10.4172/2153-0602.1000E118","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"116 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79187552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-10-06 DOI: 10.4172/2153-0602.1000162
M. Srivastava, M. Shahid, P. Sonika, Ey, A. Singh, Vipul Kumar, S. Gupta, M. Maurya
Trichoderma species are widely used in agriculture as biopesticides. These fungi reproduce asexually by producing conidia and chlamydospores and, in wild habitats, by ascospores. Trichoderma species are well known for producing enzymes called Cell Wall Degrading Enzymes (CWDEs). All living organisms carry genes, each coding for a protein that performs a particular function. Genes that play an important role in the biocontrol process are known as biocontrol genes. These genes send signals that help in the secretion of proteins and enzymes that degrade plant pathogens. Biocontrol genes can be cloned in large quantities and used on a large scale for commercial production. Some Trichoderma genes also help provide resistance to biotic and abiotic stresses such as heat, drought and salt. The major biocontrol processes include antibiosis, mycoparasitism and providing plant nutrition.
{"title":"Trichoderma Genome to Genomics: A Review","authors":"M. Srivastava, M. Shahid, P. Sonika, Ey, A. Singh, Vipul Kumar, S. Gupta, M. Maurya","doi":"10.4172/2153-0602.1000162","DOIUrl":"https://doi.org/10.4172/2153-0602.1000162","url":null,"PeriodicalId":15630,"journal":{"name":"Journal of Data Mining in Genomics & Proteomics","volume":"1 1","pages":"1-4"},"PeriodicalIF":0.0,"publicationDate":"2014-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90074594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}