GenoMosaic: on-demand multiple genome comparison and comparative annotation

C. Gibas, D. Sturgill, J. Weller
Published in: Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.
Publication date: 2003-03-10
DOI: 10.1109/BIBE.2003.1188942 (https://doi.org/10.1109/BIBE.2003.1188942)
Citations: 2

Abstract

GenoMosaic is a portable database application for on-demand multiple genome comparison. We discuss the methods used to generate a GenoMosaic data set from genome sequence data and present the relational data model used in the application. We define an abstraction of genome sequence data, the feature mosaic, that bridges annotation describing features within single genes and annotation that may span multiple genes and intergenic features over long stretches of genomic sequence. The goal of this project is to support new method development for on-demand multiple genome comparison. Each genome to be compared can be modeled as a string of generic features of any type that can be computationally defined, related by adjacency information within and among genomes. The generic feature abstraction makes it possible to study the arrangement of features in the genome at a level of detail that includes RNA genes, putative regulatory regions, SNPs, overlapping transcripts, intron splice junctions, and alternative polyadenylation signals; in short, it incorporates significant sequence details that are not necessarily within protein-coding regions. This abstraction is amenable to functional implementation as a relational data model upon which novel query capabilities can be built, and it provides objects that can be analyzed using algorithms for comparison of strings and lists. As an initial effort, we have implemented a prototype that uses a representative set of comparative and content-based annotation methods to reduce a collection of prokaryotic genomes to a feature mosaic representation. Entity-relationship modeling was then used to develop a data model capable of storing detailed results, including complete parameters for each instance of analysis.
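The feature mosaic idea can be illustrated with a small sketch. The code below is not the GenoMosaic implementation or schema; it is a minimal Python illustration under stated assumptions. The `Feature` fields, the `feature` table layout, and the helper names (`build_mosaic`, `shared_runs`, `store`, `downstream_kinds`) are all hypothetical. It models each genome as an ordered list of generic features, compares two genomes with a standard list-comparison routine (`difflib.SequenceMatcher`), and stores adjacency as an ordinal position in a relational table so that neighbors can be queried with a self-join.

```python
import sqlite3
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Feature:
    """One generic feature on a genome (hypothetical fields)."""
    genome: str
    kind: str   # any computationally defined type, e.g. "CDS", "tRNA"
    start: int
    end: int


def build_mosaic(features):
    """Order one genome's features by position to form its mosaic."""
    return sorted(features, key=lambda f: (f.start, f.end))


def shared_runs(mosaic_a, mosaic_b):
    """Return runs of feature kinds found in the same order in both mosaics."""
    kinds_a = [f.kind for f in mosaic_a]
    kinds_b = [f.kind for f in mosaic_b]
    matcher = SequenceMatcher(None, kinds_a, kinds_b, autojunk=False)
    return [kinds_a[m.a:m.a + m.size]
            for m in matcher.get_matching_blocks() if m.size > 0]


def store(conn, mosaic):
    """Persist a mosaic; the ordinal column encodes adjacency (assumed layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feature ("
        "genome TEXT, ordinal INTEGER, kind TEXT, "
        "start_pos INTEGER, end_pos INTEGER, "
        "PRIMARY KEY (genome, ordinal))")
    conn.executemany(
        "INSERT INTO feature VALUES (?, ?, ?, ?, ?)",
        [(f.genome, i, f.kind, f.start, f.end) for i, f in enumerate(mosaic)])


def downstream_kinds(conn, genome, kind):
    """Kinds of the features immediately following each `kind` feature."""
    rows = conn.execute(
        "SELECT b.kind FROM feature a JOIN feature b "
        "ON a.genome = b.genome AND b.ordinal = a.ordinal + 1 "
        "WHERE a.genome = ? AND a.kind = ? ORDER BY a.ordinal",
        (genome, kind))
    return [row[0] for row in rows]


# Toy data, invented for illustration only.
mosaic_a = build_mosaic([
    Feature("A", "CDS", 350, 900),
    Feature("A", "CDS", 1, 100),
    Feature("A", "promoter", 300, 340),
    Feature("A", "tRNA", 150, 220),
])
mosaic_b = build_mosaic([
    Feature("B", "tRNA", 10, 80),
    Feature("B", "promoter", 120, 160),
    Feature("B", "CDS", 170, 700),
    Feature("B", "rRNA", 800, 2300),
])

conn = sqlite3.connect(":memory:")
store(conn, mosaic_a)
store(conn, mosaic_b)

print(shared_runs(mosaic_a, mosaic_b))      # [['tRNA', 'promoter', 'CDS']]
print(downstream_kinds(conn, "A", "CDS"))   # ['tRNA']
```

Comparing only the feature kinds is a deliberate simplification. A fuller model would also relate features among genomes (for example, by similarity) and record the parameters of each analysis run, as the abstract describes.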