
Journal of Data Mining in Genomics & Proteomics: Latest Articles

Computational Identification, Characterization and Analysis of Conserved miRNAs and their Targets in Amborella Trichopoda
Pub Date : 2015-02-10 DOI: 10.4172/2153-0602.1000168
B. Hajieghrari, N. Farrokhi, B. Goliaei, K. Kavousi
MicroRNAs (miRNAs) are single-stranded, non-coding, endogenous small RNAs of about 22 nucleotides that are directly involved in regulating gene expression at the post-transcriptional level. miRNAs play key roles in development and in responses to biotic and abiotic stresses. Because miRNAs are relatively highly conserved across plant species, homology searches allow new miRNAs to be identified. Here, miRNAs were identified for Amborella trichopoda. Known and unique plant miRNAs from miRBase were BLAST-searched against Expressed Sequence Tags (ESTs) and Genomic Survey Sequences (GSSs) of A. trichopoda. All candidate sequences with an appropriate fold-back structure were screened against a series of miRNA filtering criteria. Finally, we identified and analysed the conservation of 5 potential conserved miRNAs belonging to 5 miRNA gene families from ESTs, as well as 82 newly identified miRNAs belonging to 39 miRNA families from GSSs. Potential target genes of the identified miRNAs were predicted from their sequence complementarity to the respective miRNAs, using psRNATarget against the scaffold assignment of A. trichopoda genome sequences. In total, 1,219 target sites were identified in the A. trichopoda genome. Of these, 941 (77.19%) were predicted to be subject to miRNA cleavage and 278 (22.81%) to be regulated via translational repression of mRNA. Of the predicted miRNAs, 18 had no target sequence in A. trichopoda.
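The target-site percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the three counts stated in the abstract; the variable and function names are illustrative and not taken from the paper.

```python
# Counts as reported in the abstract.
total_sites = 1219        # all predicted target sites
cleavage_sites = 941      # sites predicted to be subject to miRNA cleavage
repression_sites = 278    # sites predicted to be translationally repressed


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


cleavage_pct = share(cleavage_sites, total_sites)
repression_pct = share(repression_sites, total_sites)

# The two classes should account for every reported site.
assert cleavage_sites + repression_sites == total_sites
print(f"cleavage: {cleavage_pct}%  repression: {repression_pct}%")
```

The two classes partition the 1,219 sites, and the rounded shares agree with the 77.19% and 22.81% given in the abstract.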
Citations: 12
An Insightful Molecular Analysis Reveals Foreign Honeybees Among Algerian Honeybee Populations (Apis mellifera L.)
Pub Date : 2015-02-07 DOI: 10.4172/2153-0602.1000166
M. Achou, WahidaLoucif-Ayad, H. Legout, Hayan Hmidan, Mohamed Alburaki, L. Garnery
This study assessed the genetic diversity of honeybees (Apis mellifera) in Algeria, North Africa, using the mitochondrial DNA (mtDNA) marker COI-COII (Cytochrome Oxidase I and II). In total, 582 honeybee workers were sampled from 22 regions of the country. A PCR-RFLP (Polymerase Chain Reaction Restriction Fragment Length Polymorphism) analysis of the mtDNA samples distinguished the honeybee evolutionary lineages and mtDNA haplotypes in each region. Our data revealed three different honeybee lineages among the studied populations: the African (A), North Mediterranean (C) and West Mediterranean (M) lineages. Eight different mtDNA haplotypes were recorded at various frequencies (A1, A2, A8, A9, A10, A13, C7 and M4). For the first time, our results identified a low level of genetic introgression (3.1%) of non-local mtDNA haplotypes (C7 and M4) among the local Algerian honeybees, most likely due to the import of foreign honeybees. Notably, the southern Algerian honeybee populations had lower haplotype diversity than the northern populations. Overall, the local North African honeybee subspecies A. m. intermissa and/or A. m. sahariensis seem to be remarkably dominant across northern Algeria.
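The abstract compares haplotype diversity between northern and southern populations without saying which estimator was used. A minimal sketch, assuming Nei's unbiased haplotype diversity, h = n/(n-1) * (1 - sum of squared haplotype frequencies), is shown below. The two sample lists are hypothetical and serve only to illustrate the calculation; they are not data from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (probability two draws match).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical samples, one label per worker bee (not the study's data).
north_sample = ["A1"] * 10 + ["A2"] * 5 + ["A8"] * 3 + ["C7"] * 2
south_sample = ["A1"] * 18 + ["A9"] * 2

print("north:", round(haplotype_diversity(north_sample), 3))
print("south:", round(haplotype_diversity(south_sample), 3))
```

A sample dominated by one haplotype scores close to 0, while a sample in which every individual carries a different haplotype scores 1.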
Citations: 11
Prediction of Structural Patterns of Interest from Protein Primary Sequence through Structural Alphabet: Illustration to ATP/GTP Binding Site Prediction
Pub Date : 2015-02-07 DOI: 10.4172/2153-0602.1000167
C. Reynès, Leslie Regad, R. Sabatier, A. Camproux
Predicting particular structural motifs associated with biological functions or with structure is of utmost importance. Given the increasing availability of primary sequences without any structural information, predictions from amino-acid (AA) sequences are essential. The proposed method for predicting structural motifs is a two-step approach based on a structural alphabet, which allows any 3D structure to be encoded as a 1D sequence of structural letters (SL). First, basic correspondence rules between AA and SL are learnt through genetic programming. Then, a Hidden Markov Model is learnt for each previously identified motif of interest. Finally, for any given amino-acid sequence, the method provides the probability that it corresponds to a given 3D motif. The method is applied to ATP binding sites to compare its efficiency with that of other methods on a classical function. Its ability to learn motifs corresponding to more rarely predicted functions, or to other types of motifs, is then illustrated.
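The Hidden Markov Model step can be illustrated with the standard forward algorithm, which gives the probability of an observed letter sequence under a given model. The sketch below is not the authors' implementation: the two hidden states, the three-letter alphabet ("a", "b", "c") and all probabilities are hypothetical toy values, and the genetic-programming step that maps amino acids to structural letters is not shown.

```python
def forward_likelihood(obs, states, start_p, trans_p, emit_p):
    """Return P(obs | model) for a discrete HMM via the forward algorithm."""
    if len(obs) == 0:
        raise ValueError("observation sequence must not be empty")
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical toy model: hidden states and a three-letter structural alphabet.
STATES = ("motif", "background")
START_P = {"motif": 0.5, "background": 0.5}
TRANS_P = {
    "motif": {"motif": 0.8, "background": 0.2},
    "background": {"motif": 0.1, "background": 0.9},
}
EMIT_P = {
    "motif": {"a": 0.7, "b": 0.2, "c": 0.1},
    "background": {"a": 0.2, "b": 0.4, "c": 0.4},
}

likelihood = forward_likelihood("aab", STATES, START_P, TRANS_P, EMIT_P)
print(f"P('aab' | toy motif model) = {likelihood:.4f}")
```

With one such model per motif, a sequence of structural letters could be scored against each model and the likelihoods compared. In practice log-probabilities or scaling would be needed to avoid underflow on long sequences.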
Citations: 0
Aining Granule Stabilizes the Decline of CD4 Cell Count in HAART-Receiving HIV/AIDS Patients Having Virologic Failure
Pub Date : 2015-02-07 DOI: 10.4172/2153-0602.1000165
L. Ross, L. Frances, Cen Yu-wen, Xu Yang, W. Jian
Objective: To explore the pattern by which Aining granule administration slows CD4 T cell depletion. Method: Data from prospective, randomized, placebo-controlled, double-blind clinical trials enrolling one hundred HIV/AIDS individuals were re-analyzed to observe the course of different CD T cells over the treatment period. Participants were randomized into two groups: one group of 50 cases received Aining granule plus the combination of d4T, ddI and NVP, and the other received placebo plus the combination of d4T, ddI and NVP, with an observation period of 11 months. Results: Only patients in the Aining granule treatment group (7, vs. 0 in the control group, a deviation exceeding 2 sigma) had a stable CD4 T cell count over the treatment course. Conclusion: Our results provide insights for molecular investigation of the relationship between the active ingredients of Aining granule, DNA replication and HIV-induced CD4 T cell death.
Citations: 2
Combining Disparate Data Types: Protein Sequences and Protein Structures
Pub Date : 2015-01-02 DOI: 10.4172/2153-0602.1000E117
Kejue Jia, R. Jernigan
With the development of high-throughput, next-generation sequencing and other advanced technologies, a large number of gene expression profiles have been produced. Many of these profiles are available from public databases [1-3]. A challenging research problem that has drawn a lot of attention in the past is to infer gene regulatory networks from the expression data. A gene regulatory network is represented by a directed graph, in which nodes represent transcription factors or mRNA with edges showing transcriptional regulatory relationships between two nodes.
Citations: 0
Mining Next Generation Sequencing Data: How to Avoid "Treasure in, Error Out".
Pub Date : 2015-01-01 Epub Date: 2015-06-06 DOI: 10.4172/2153-0602.1000e119
Zhihua Jiang
During the last ten years, next generation sequencing methods, technologies and platforms have revolutionized the genomics and transcriptomics research fields and advanced their applications in agriculture and biomedicine [1–3]. To date, five major platforms are on the market: the Roche 454 GS FLX(+) system; Applied Biosystems SOLiD (supported oligonucleotide ligation and detection) and the Ion Proton/PGM/Chef systems, now owned by Life Technologies (Grand Island, NY); Solexa GA (Genome Analyzer)/HiSeq/MiSeq/NextSeq, developed by Illumina (San Diego, CA); and the PacBio RSII system, made by Pacific Biosciences (Menlo Park, CA). They differ in sequencing chemistry (e.g., sequencing by ligation vs. sequencing by synthesis); template (e.g., single molecules vs. clusters amplified by emulsion or bridge PCR); product size (e.g., from 75 bp to 8,500 bp in length) and number of reads per run (e.g., from one million to 5,000 million) [2,3].
Citations: 2
Causal Inference in the Age of Decision Medicine.
Pub Date : 2015-01-01 DOI: 10.4172/2153-0602.1000163
A Yazdani, E Boerwinkle

Causal analysis and causal inference are a growing area of biostatistics. In parallel, there is increasing focus on using genomic information to guide medical practice, i.e. personalized medicine or decision medicine. This perspective discusses causal inference in the context of personalized or decision medicine, including its assumptions and the idea that the task differs depending on whether the primary goal is to estimate the average treatment response in the population or to characterize the response of an individual or a subgroup. The perspective provides a tutorial on modern causal inference and then suggests how applying specific kinds of causal inference would promote advances in translational science. The concept of the subpopulation causal effect is one path toward improved decision medicine. A dataset containing cardiovascular disease risk factor levels and genomic information is analyzed, and different causal effects are estimated.
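The distinction between a population-average effect and a subpopulation effect can be made concrete with a toy calculation. The sketch below assumes a hypothetical randomized setting, so that a simple difference in mean outcomes estimates the causal effect. The records, the subgroup labels "G1" and "G2", and the outcome values are invented for illustration and are unrelated to the dataset analyzed in the paper.

```python
from statistics import mean


def mean_difference(records):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [r["y"] for r in records if r["treated"]]
    control = [r["y"] for r in records if not r["treated"]]
    return mean(treated) - mean(control)


# Hypothetical randomized records: subgroup label, treatment flag, outcome.
records = [
    {"group": "G1", "treated": True, "y": 4.0},
    {"group": "G1", "treated": True, "y": 6.0},
    {"group": "G1", "treated": False, "y": 1.0},
    {"group": "G1", "treated": False, "y": 3.0},
    {"group": "G2", "treated": True, "y": 2.0},
    {"group": "G2", "treated": True, "y": 2.0},
    {"group": "G2", "treated": False, "y": 2.0},
    {"group": "G2", "treated": False, "y": 2.0},
]

average_effect = mean_difference(records)
subgroup_effects = {
    g: mean_difference([r for r in records if r["group"] == g])
    for g in sorted({r["group"] for r in records})
}
print("average effect:", average_effect)
print("subgroup effects:", subgroup_effects)
```

In this toy data the population-average effect is a blend of a clear effect in one subgroup and no effect in the other, which is the kind of heterogeneity a subpopulation analysis is meant to expose. Observational data would need adjustment for confounding, which this sketch does not attempt.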

Citations: 35
BioGyan: A Tool to Identify Gene Functions from Literature
Pub Date : 2015-01-01 DOI: 10.4172/2153-0602.1000164
Shiva Kumar, Vijay H. Ghadage, I. Subramanian, A. Desai, Vivek Singh, A. Jere
Background: The primary objective of life science research is to understand complex cellular mechanisms and the interplay of various genes/proteins in multiple cellular processes. For this, PubMed is still the primary source of biomedical information, even though multiple other databases such as UniProt, Protein Data Bank (PDB) and Reactome exist. Objective: With the large volume of data available from high-throughput technologies and multiple databases, finding relevant gene-process-phenotype information has become extremely challenging and tedious. No tool is currently available to search PubMed and multiple other databases simultaneously to obtain holistic information. Moreover, a typical PubMed search returns a large number of articles, which must be screened manually to identify relevant literature. Hence, we developed BioGyan, a literature mining tool that simplifies the combinatorial search for genes, cell types and cellular processes in PubMed and other relevant databases. Methods: BioGyan uses a robust scoring method to rank articles by relevance to the user's search terms. The scoring method is based on the weighted sum of the co-occurrence of gene, process and interaction terms in an abstract. Results: BioGyan retrieves PubMed articles supporting an association between the queried genes and processes, relevant pathways from pathway databases and 3-dimensional structures from PDB. For easy viewing, all information is presented to the user in a single window. BioGyan showed an accuracy of 85.46% in predicting the relevance of articles to a gene-process association, and performed better than PESCADOR. Conclusion: BioGyan has several key features, such as batch querying of genes and processes, offline reading of articles, export of article lists as a bibliography and the flexibility for users to revise article relevance, making it a vital tool for literature search. Thus, BioGyan is a unique tool that offers holistic search across multiple databases while greatly automating the entire process.
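The abstract describes the ranking score only as a weighted sum of gene, process and interaction term co-occurrence in an abstract. One way such a score might look is sketched below. This is not BioGyan's actual formula: the weights, the rule that both a gene and a process must appear, the whole-word matching and the example abstracts are all assumptions made for illustration.

```python
import re

# Hypothetical weights per term class (not taken from the paper).
WEIGHTS = {"gene": 1.0, "process": 1.0, "interaction": 0.5}


def count_term(text, term):
    """Count whole-word, case-insensitive occurrences of term in text."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def relevance_score(abstract, genes, processes, interactions):
    """Weighted sum of term counts; zero unless a gene and a process co-occur."""
    counts = {
        "gene": sum(count_term(abstract, t) for t in genes),
        "process": sum(count_term(abstract, t) for t in processes),
        "interaction": sum(count_term(abstract, t) for t in interactions),
    }
    if counts["gene"] == 0 or counts["process"] == 0:
        return 0.0
    return sum(WEIGHTS[k] * counts[k] for k in counts)


# Hypothetical abstracts and query terms, for illustration only.
abstracts = {
    "doc1": "TP53 regulates apoptosis. TP53 activates BAX.",
    "doc2": "TP53 mutations are frequent in tumours.",
}
scores = {
    doc_id: relevance_score(text, ["TP53"], ["apoptosis"], ["regulates", "activates"])
    for doc_id, text in abstracts.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Requiring both a gene and a process term keeps articles that mention only one of them from being ranked as evidence for a gene-process association, whatever the actual weighting used by the tool.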
Citations: 0
Inferring Gene Regulatory Networks: Challenges and Opportunities
Pub Date : 2015-01-01 DOI: 10.4172/2153-0602.1000E118
J. Wang
With the development of high-throughput, next-generation sequencing and other advanced technologies, a large number of gene expression profiles have been produced. Many of these profiles are available from public databases [1-3]. A challenging research problem that has drawn a lot of attention in the past is to infer gene regulatory networks from the expression data. A gene regulatory network is represented by a directed graph, in which nodes represent transcription factors or mRNA with edges showing transcriptional regulatory relationships between two nodes.
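The directed-graph representation described here can be sketched as a pair of adjacency maps, one for outgoing and one for incoming regulatory edges. The class name, method names and node labels ("TF_A", "gene_X" and so on) below are hypothetical, and the sketch covers only the data structure, not the inference of edges from expression data.

```python
from collections import defaultdict


class RegulatoryNetwork:
    """Directed graph: an edge regulator -> target is a regulatory relationship."""

    def __init__(self):
        self._targets = defaultdict(set)     # regulator -> regulated nodes
        self._regulators = defaultdict(set)  # target -> regulating nodes

    def add_regulation(self, regulator, target):
        """Record that regulator transcriptionally regulates target."""
        self._targets[regulator].add(target)
        self._regulators[target].add(regulator)

    def targets_of(self, node):
        """Nodes directly regulated by node."""
        return set(self._targets.get(node, ()))

    def regulators_of(self, node):
        """Nodes that directly regulate node."""
        return set(self._regulators.get(node, ()))


# Hypothetical edges for illustration.
network = RegulatoryNetwork()
network.add_regulation("TF_A", "gene_X")
network.add_regulation("TF_A", "TF_B")
network.add_regulation("TF_B", "gene_X")

print(sorted(network.targets_of("TF_A")))
print(sorted(network.regulators_of("gene_X")))
```

Because the edges are directed, "TF_A regulates TF_B" does not imply the reverse, which is why incoming and outgoing neighbours are stored separately.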
Citations: 4
Trichoderma Genome to Genomics: A Review
Pub Date : 2014-10-06 DOI: 10.4172/2153-0602.1000162
M. Srivastava, M. Shahid, P. Sonika, Ey, A. Singh, Vipul Kumar, S. Gupta, M. Maurya
Trichoderma species are widely used in agriculture as biopesticides. These fungi reproduce asexually by producing conidia and chlamydospores and, in wild habitats, sexually by ascospores. Trichoderma species are well known for producing Cell Wall Degrading Enzymes (CWDEs). All living organisms carry genes that code for proteins, each of which performs a particular function. Genes that play an important role in the biocontrol process are known as biocontrol genes. These genes send signals that aid the secretion of proteins and enzymes that degrade plant pathogens. Biocontrol genes can be cloned in large quantities and used at scale for commercial production. Some Trichoderma genes also help confer resistance to biotic and abiotic stresses such as heat, drought and salt. The major biocontrol processes include antibiosis, mycoparasitism and providing plant nutrition.
Citations: 33