Proceedings of the ... Asia-Pacific bioinformatics conference: Latest Articles

System Identification and Robustness Analysis of the Circadian Regulatory Network via Microarray Data in Arabidopsis Thaliana
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0006
C. Li, W. Chang, B. S. Chen
The circadian regulatory network is one of the main topics of plant research. The intracellular interactions among genes in response to environmental light stimuli are fundamental to functional genomics in plants. However, the sensitivity of the circadian system has not yet been analyzed in plants using a perturbed stochastic dynamic model built from microarray data. In this study, the circadian network of Arabidopsis thaliana is constructed using a stochastic dynamic model that accounts for sigmoid interaction, activation delay, and regulation by input light. The describing function method from nonlinear control theory, which characterizes nonlinear limit cycles (oscillations), is used to interpret the oscillations of the circadian regulatory network: a nonlinear network continues to oscillate if its feedback loop gain equals 1. Based on the dynamic model identified from microarray data, a system sensitivity analysis is performed to assess the robustness of the circadian regulatory network under biological perturbations. We found that the circadian network is more sensitive to perturbations of the trans-expression threshold and of the steady-state activation level than to perturbations of the trans-sensitivity rate.
Pages: 27-37
Citations: 0
Identification of MicroRNA Precursors via SVM
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0030
L. Yang, W. Hsu, M. Lee, L. Wong
MiRNAs are short non-coding RNAs that regulate gene expression. While the first miRNAs were discovered using experimental methods, experimental miRNA identification remains technically challenging and incomplete. This calls for the development of computational approaches to complement experimental approaches to miRNA gene identification. In this paper we propose a de novo miRNA precursor prediction method. This method follows the “feature generation, feature selection, and feature integration” paradigm of constructing recognition models for genomic sequences. We generate and identify features based on information in both primary sequence and secondary structure, and use these features to construct SVM-based models for the recognition of miRNA precursors. Experimental results show that our method is effective and can achieve good sensitivity and specificity.
Pages: 267-276
Citations: 4
Automating the Search for Lateral Gene Transfer
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0002
M. Ragan
Most genes have attained their observed distribution among genomes by transmission from parent to offspring through time. In prokaryotes (bacteria and archaea), however, some genes are where they are as the result of transfer from an unrelated lineage. To elucidate the biological origins and functional consequences of lateral gene transfer (LGT), we have constructed an automated computational pipeline to recognise protein families among prokaryotic genomes, generate high-quality multiple sequence alignments of orthologs, infer statistically sound phylogenetic trees, and find topologically incongruent subtrees (prima facie instances of LGT). This pipeline requires that we automate workflows, design and optimize algorithms, mobilise high-performance computing resources, and efficiently manage federated data. I will summarise results from the automated comparison of 422971 proteins in 22437 families across 144 sequenced prokaryotic genomes, including the nature and extent of LGT among these lineages, major donors and recipients, the biochemical pathways and physiological functions most affected, and implications for the role of LGT in evolution of biochemical pathways.
Pages: 3
Citations: 1
Property-Dependent Analysis of Aligned Proteins from Two Or More Populations
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0020
Steinar Thorvaldsen, E. Ytterstad, T. Flå
Multiple sequence alignments can provide information for comparative analyses of proteins and protein populations. We present some statistical trend-tests that can be used when an aligned data set can be divided into two or more populations based on phenotypic traits such as preference of temperature, pH, salt concentration or pressure. The approach is based on estimation and analysis of the variation between the values of physicochemical parameters at positions of the sequence alignment. Monotonic trends are detected by applying a cumulative Mann-Kendall test. The method is found to be useful to identify significant physicochemical mechanisms behind adaptation to extreme environments and uncover molecular differences between mesophile and extremophile organisms. A filtering technique is also presented to visualize the underlying structure in the data. All the comparative statistical methods are available in the toolbox DeltaProt.
Pages: 169-178
Citations: 6
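
The monotonic-trend step mentioned in the DeltaProt abstract above can be illustrated with the standard Mann-Kendall test. This is only a sketch under stated assumptions: it is not the DeltaProt implementation, it does not reproduce the cumulative variant the authors describe, and the function name `mann_kendall` and the sample values are invented for illustration. It uses the usual normal approximation with a tie correction.

```python
import math


def mann_kendall(values):
    """Return (S, Z, two-sided p) for the standard Mann-Kendall trend test.

    `values` is a sequence ordered along the factor of interest, for example
    one physicochemical property averaged per population and ordered by
    growth temperature (a hypothetical layout, not taken from the paper).
    """
    n = len(values)

    # S is the number of increasing pairs minus the number of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the null hypothesis, corrected for tied values.
    tie_counts = {}
    for v in values:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    var_s = n * (n - 1) * (2 * n + 5)
    for t in tie_counts.values():
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18.0

    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


# Invented example: mean hydrophobicity at one alignment position for
# populations ordered from cold-adapted to heat-adapted organisms.
example = [0.21, 0.25, 0.24, 0.31, 0.35]
print(mann_kendall(example))
```

A positive S with a small p-value would suggest an increasing trend across the ordered populations; with only a handful of populations the normal approximation is rough, so an exact or permutation-based p-value may be preferable.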
Consequences of Mutation, Selection, and Physico-Chemical Properties of Encoded Proteins on Synonymous Codon Usage in Adenoviruses
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0018
Sandip Paul, Sabyasachi Das, C. Dutta
Trends in synonymous codon usage in adenoviruses have been examined through multivariate statistical analysis of the annotated protein-coding regions of 22 adenoviral species for which complete genome sequences are available. One of the major determinants of such trends is the G + C content at third codon positions of the genes, the average value of which varies from one viral genome to another depending on the overall mutational bias of the species. G3S and C3S interact synergistically along the first principal axis of correspondence analysis on the Relative Synonymous Codon Usage of adenoviral genes, but antagonistically along the second principal axis. Other major determinants of the trends are natural selection, putatively operative at the level of translation, and, quite interestingly, the hydropathy of the encoded proteins. The trends in codon usage, though characterized by distinct virus-specific mutational bias, do not exhibit any sign of host-specificity. Significant variations are observed in synonymous codon choice between structural and nonstructural genes of adenoviruses.
Pages: 149-158
Citations: 0
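
The quantities named in the adenovirus abstract above, Relative Synonymous Codon Usage and G + C content at synonymous third positions, can be computed directly from a coding sequence. The sketch below is an illustration, not the authors' pipeline: the function names are hypothetical, the standard genetic code is assumed, and `gc3s` follows one common definition (third-position G or C among codons of amino acids that have more than one codon).

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group sense codons by the amino acid they encode (stop codons excluded).
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)


def codon_counts(cds):
    """Count in-frame codons; a trailing partial codon is ignored."""
    seq = cds.upper()
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))


def rscu(cds):
    """RSCU = observed codon count / mean count over its synonymous family."""
    counts = codon_counts(cds)
    result = {}
    for codons in FAMILIES.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in codons:
            result[c] = counts[c] * len(codons) / total
    return result


def gc3s(cds):
    """Fraction of G or C at the third position of synonymous codons."""
    counts = codon_counts(cds)
    degenerate = [c for fam in FAMILIES.values() if len(fam) > 1 for c in fam]
    total = sum(counts[c] for c in degenerate)
    if total == 0:
        return 0.0
    return sum(counts[c] for c in degenerate if c[2] in "GC") / total


# Toy sequence of four alanine codons (invented, not adenoviral data).
toy = "GCTGCTGCCGCA"
print(rscu(toy)["GCT"], gc3s(toy))
```

An RSCU of 1 means a codon is used as often as expected under uniform synonymous usage; a table of RSCU values per gene is the kind of input that a correspondence analysis would then take.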
Identification of Over-Represented Combinations of Transcription Factor Binding Sites in Sets of Co-Expressed Genes
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0028
S. Huang, Debra L. Fulton, David J. Arenillas, P. Perco, S. Sui, J. Mortimer, W. Wasserman
Transcription regulation is mediated by combinatorial interactions between diverse trans-acting proteins and arrays of cis-regulatory sequences. Revealing this complex interplay between transcription factors and binding sites remains a fundamental problem for understanding the flow of genetic information. The oPOSSUM analysis system facilitates the interpretation of gene expression data through the analysis of transcription factor binding sites shared by sets of co-expressed genes. The system is based on cross-species sequence comparisons for phylogenetic footprinting and motif models for binding site prediction. We introduce a new set of analysis algorithms for the study of the combinatorial properties of transcription factor binding sites shared by sets of co-expressed genes. The new methods circumvent computational challenges through an applied focus on families of transcription factors with similar binding properties. The algorithm accurately identifies combinations of binding sites over-represented in reference collections and clarifies the results obtained by existing methods for the study of isolated binding sites.
Pages: 247-256
Citations: 11
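
The idea of testing whether a combination of binding sites is over-represented in a co-expressed gene set, as in the abstract above, can be sketched with a hypergeometric tail probability. The abstract does not name the scoring statistic, so the choice of test here is an assumption, as are the function names, the data layout (gene mapped to a set of transcription factor family names), and the toy genes; this is not the oPOSSUM API.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


def pair_enrichment(background, foreground, family_a, family_b):
    """P-value that the pair (family_a, family_b) is over-represented.

    `background` and `foreground` map gene name -> set of TF family names
    with a predicted site near that gene. The foreground (co-expressed
    genes) is assumed to be a subset of the background.
    """
    def has_pair(sites):
        return family_a in sites and family_b in sites

    hits_background = sum(has_pair(s) for s in background.values())
    hits_foreground = sum(has_pair(s) for s in foreground.values())
    return hypergeom_upper_tail(
        len(background), hits_background, len(foreground), hits_foreground
    )


# Invented toy data: six background genes, two of them co-expressed.
background = {
    "g1": {"ETS", "bHLH"}, "g2": {"ETS", "bHLH"}, "g3": {"ETS"},
    "g4": {"bHLH"}, "g5": {"FORKHEAD"}, "g6": set(),
}
foreground = {g: background[g] for g in ("g1", "g2")}
print(pair_enrichment(background, foreground, "ETS", "bHLH"))
```

Grouping factors into families, as the abstract describes, keeps the number of candidate combinations manageable; with many pairs tested, the resulting p-values would still need a multiple-testing correction.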
Genome Wide Computational Analysis of Small Nuclear RNA Genes for Oryza Sativa (Indica and Japonica)
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0031
M. Shashikanth, A. Snethalatharani, S. Mubarak, K. Ulaganathan
Genome-wide computational analysis of small nuclear RNA (snRNA) genes identified 76 and 73 putative snRNA genes in the indica and japonica rice genomes, respectively. Predicted sequences were designated as snRNA genes using three basic criteria: a minimum of 70% sequence identity to the plant snRNA gene used for the genome search; the presence of conserved promoter elements, namely the TATA box, the USE motif, and monocot promoter specific elements (MSPs); and extensive sequence alignment to rice/plant expressed sequence tags. Comparative sequence analysis with snRNA genes from other organisms, together with predicted secondary structures, showed overall conservation of snRNA sequence and structure, along with plant-specific features (a TATA box in both polymerase II and polymerase III transcribed genes, and a USE motif located upstream of the TATA box at a fixed distance that differs between polymerase II and polymerase III transcribed snRNA genes) and multiple monocot-specific MSPs upstream of the USE motif. Detailed analysis results, including all multiple sequence alignments, sequence logos, secondary structures, and sequences, are available at http://kulab.org
Pages: 277-286
Citations: 0
EDAM: An Efficient Clique Discovery Algorithm with Frequency Transformation for Finding Motifs
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0015
Yifei Ma, Guoren Wang, Yongguang Li, Yuhai Zhao
Finding motifs in DNA sequences plays an important role in deciphering transcriptional regulatory mechanisms and in drug target identification. In this paper, we propose an efficient algorithm, EDAM, for finding motifs based on frequency transformation and Minimum Bounding Rectangle (MBR) techniques. It works in three phases: frequency transformation, MBR-clique searching, and motif discovery. In frequency transformation, EDAM divides the sample sequences into a set of substrings using sliding windows, then transforms them into frequency vectors, which are stored in MBRs. In MBR-clique searching, EDAM uses the frequency distance theorems to search for the MBR-cliques used in motif discovery. In motif discovery, EDAM discovers larger cliques by extending smaller cliques with their neighbors. To accelerate clique discovery, we propose a range query facility that avoids unnecessary computations during clique extension. The experimental results show that EDAM effectively addresses the running-time bottleneck of the motif discovery problem in large DNA databases.
Pages: 119-128
Citations: 0
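
The frequency-transformation phase described in the EDAM abstract above can be sketched as follows. This is an illustration under assumptions, not the EDAM code: the function names are hypothetical, and the distance used is the generic bound that half the L1 distance between base-count vectors never exceeds the Hamming distance between two equal-length strings, which may differ from the exact form of the paper's frequency distance theorems.

```python
ALPHABET = "ACGT"


def frequency_vector(substring):
    """Count of each base in the substring, in ACGT order."""
    return tuple(substring.count(base) for base in ALPHABET)


def window_vectors(sequence, width):
    """Slide a window of `width` over the sequence; return (offset, vector)."""
    return [
        (i, frequency_vector(sequence[i:i + width]))
        for i in range(len(sequence) - width + 1)
    ]


def frequency_distance(u, v):
    """Half the L1 distance between two count vectors of equal total.

    One substitution moves one unit between two coordinates, so this value
    is a lower bound on the Hamming distance of the underlying strings.
    """
    return sum(abs(a - b) for a, b in zip(u, v)) // 2


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def bounding_rectangle(vectors):
    """Minimum bounding rectangle: per-coordinate (low, high) corners."""
    lows = tuple(min(col) for col in zip(*vectors))
    highs = tuple(max(col) for col in zip(*vectors))
    return lows, highs


# Invented toy sequence; windows whose frequency distance already exceeds
# the allowed number of mismatches can be pruned without comparing strings.
vectors = [vec for _, vec in window_vectors("ACGTACGGTA", 4)]
print(bounding_rectangle(vectors))
```

Because the bound is cheap to evaluate on count vectors, candidate pairs can be filtered before any character-level comparison, which is the kind of pruning a range query over MBRs is meant to provide.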
Gene Expression Data Clustering Based on Local Similarity Combination
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0038
De Pan, Fei Wang
Clustering is widely used in gene expression analysis, where it helps to group genes with similar biological function. Traditional clustering techniques are not suitable for direct application to gene expression time series data because of the inherent properties of local regulation and time shift. To cope with these two problems, local similarity and time shift, we have developed a new similarity measurement technique called Local Similarity Combination. Finally, we run our method on real gene expression data and show that it works well.
Pages: 353-362
Citations: 0
A Recursive Method for Solving Haplotype Frequencies in Multiple Loci Linkage Analysis
Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0016
M. Ng
Multiple loci analysis has become popular with advances in biological experiments. Many studies have focused on the biological and statistical properties of such multiple loci analysis. In this paper, we study one of the important computational problems: solving for the probabilities of haplotype classes from a large linear system Ax = b derived from the recombination events in multiple loci analysis. Since the size of the recombination matrix A increases exponentially with the number of loci, fast solvers are required to handle a large number of loci. By exploiting the structure of the matrix A, we develop an efficient recursive algorithm for solving such structured linear systems. In particular, the proposed algorithm requires O(m log m) operations and O(m) memory locations, where m is the size of the matrix A. Numerical examples are given to demonstrate the effectiveness of our solver.
Source: Proceedings of the ... Asia-Pacific bioinformatics conference, pp. 129-138. Published 2005-12-01.
Citations: 0