Prediction of probable genes by Fourier analysis of genomic sequences.

S Tiwari, S Ramachandran, A Bhattacharya, S Bhattacharya, R Ramaswamy
Computer applications in the biosciences : CABIOS. Published 1997-06-01. DOI: 10.1093/bioinformatics/13.3.263 (https://doi.org/10.1093/bioinformatics/13.3.263)
Citation count: 468

Abstract

MOTIVATION: The major signal in coding regions of genomic sequences is a three-base periodicity. Our aim is to use Fourier techniques to analyse this periodicity, and thereby to develop a tool to recognize coding regions in genomic DNA.

RESULT: The three-base periodicity in the nucleotide arrangement appears as a sharp peak at frequency f = 1/3 in the Fourier (or power) spectrum. From extensive spectral analysis of DNA sequences of total length over 5.5 million base pairs from a wide variety of organisms (including the human genome), and by separately examining coding and non-coding sequences, we find that the relative height of the peak at f = 1/3 in the Fourier spectrum is a good discriminator of coding potential. We use this feature to detect probable coding regions in DNA sequences by examining the local signal-to-noise ratio of the peak within a sliding window. While the overall accuracy is comparable to that of other techniques currently in use, the proposed measure is independent of training sets or existing database information, and can thus find general application.

AVAILABILITY: A computer program, GeneScan, which locates coding open reading frames and exonic regions in genomic sequences, has been developed and is available on request.
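
The measure described in the abstract can be illustrated with a minimal Python sketch. This is not the GeneScan program, and the abstract does not give the exact normalization the authors use. The sketch assumes one common formulation: each base (A, C, G, T) is turned into a 0/1 indicator sequence, the squared DFT magnitudes of the four indicators are summed, and the power at f = 1/3 is divided by the mean power over all nonzero frequencies. The function names (`power_at`, `peak_to_average`, `scan`) and the default window and step sizes are hypothetical choices made for illustration.

```python
import cmath
import random

BASES = "ACGT"


def power_at(seq, k):
    """Summed squared DFT magnitude, at index k, of the four 0/1 base indicators."""
    n = len(seq)
    total = 0.0
    for base in BASES:
        # DFT coefficient of the indicator sequence for this base.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k * pos / n)
            for pos, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total


def peak_to_average(seq):
    """Power at f = 1/3 divided by the mean power over nonzero frequencies.

    Assumed normalization (not taken from the paper). The sequence is
    trimmed to a multiple of 3 so that f = 1/3 is an exact DFT index.
    """
    seq = seq.upper()
    n = len(seq) - len(seq) % 3
    if n < 3:
        return 0.0
    seq = seq[:n]
    counts = [seq.count(base) for base in BASES]
    # Parseval: the power summed over all k is n * count for each base, and
    # the k = 0 term is count**2, so the nonzero-frequency mean has a closed form.
    mean_power = sum(c * (n - c) for c in counts) / (n - 1)
    if mean_power == 0:
        return 0.0
    return power_at(seq, n // 3) / mean_power


def scan(seq, window=351, step=3):
    """Slide a window along seq and return (start, score) pairs."""
    return [
        (start, peak_to_average(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    flank = "".join(rng.choice(BASES) for _ in range(300))
    toy = flank + "ATG" * 100 + flank  # artificial period-3 stretch in the middle
    for start, score in scan(toy, window=150, step=150):
        print(start, round(score, 2))
```

In this toy input the windows lying inside the artificial period-3 stretch score far higher than the random flanks. Real coding regions show a much weaker peak, so any threshold on the score would have to be chosen empirically.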