An XML application for genomic data interoperation

K. Cheung, Yang Liu, Anuj Kumar, M. Snyder, M. Gerstein, P. Miller
Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001)
DOI: 10.1109/BIBE.2001.974417 (https://doi.org/10.1109/BIBE.2001.974417)
Publication date (as listed): 2001-03-04
Citation count: 2

Abstract

As the eXtensible Markup Language (XML) becomes a popular, even standard, language for exchanging data over the Internet/Web, a growing number of genome Web sites make their data available in XML format. Publishing genomic data in XML is of limited use, however, without software applications that take advantage of XML technology to process these data. This paper illustrates the usefulness of XML in representing genomic data and in interoperating between two different data sources (Snyder's laboratory at Yale and SGD at Stanford). In particular, we compare the locations of transposon insertions in yeast DNA sequences, identified by BLAST searches, with the chromosomal locations of the yeast open reading frames (ORFs) stored in SGD. This comparison lets us characterize the transposon insertions by indicating whether they fall within any ORFs (which may encode proteins with essential biological functions). To implement this XML-based interoperation, we used NCBI's "blastall" (which offers an XML output option) and SGD's yeast nucleotide sequence dataset to establish a local BLAST server. We also converted SGD's ORF location data file (which is available in tab-delimited format) into an XML document based on the BIOML (BIOpolymer Markup Language) standard.
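
The two steps the abstract describes (converting a tab-delimited ORF location file to XML, then checking whether an insertion position falls within an ORF) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the column layout (name, chromosome, start, end), the element and attribute names, the function names, and the sample coordinates are all assumptions made for the example, and the XML produced here does not follow the actual BIOML schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample rows: name, chromosome, start, end (tab-delimited).
# These are NOT real SGD coordinates.
SAMPLE_TAB = "ORF_A\t1\t100\t400\nORF_B\t1\t900\t600\nORF_C\t2\t150\t300\n"


def orfs_to_xml(tab_text):
    """Convert tab-delimited ORF rows into a small XML tree.

    Element/attribute names are hypothetical, not the BIOML schema.
    """
    root = ET.Element("orfs")
    for row in csv.reader(io.StringIO(tab_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        name, chrom, start, end = row
        ET.SubElement(root, "orf", name=name, chromosome=chrom,
                      start=start, end=end)
    return root


def orfs_containing(root, chrom, position):
    """Return names of ORFs on `chrom` whose span includes `position`."""
    hits = []
    for orf in root.iter("orf"):
        if orf.get("chromosome") != chrom:
            continue
        a, b = int(orf.get("start")), int(orf.get("end"))
        # An ORF on the reverse strand may be listed with start > end,
        # so normalise the interval before testing containment.
        lo, hi = min(a, b), max(a, b)
        if lo <= position <= hi:
            hits.append(orf.get("name"))
    return hits


if __name__ == "__main__":
    tree = orfs_to_xml(SAMPLE_TAB)
    print(ET.tostring(tree, encoding="unicode"))
    print(orfs_containing(tree, "1", 250))
```

In a full pipeline, the insertion positions would come from parsing the BLAST XML output rather than being supplied by hand; that parsing step is omitted here because the abstract does not describe its details.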