Hypermotifs: Novel Discriminatory Patterns for Nucleotide Sequences and their Application to Core Promoter Prediction in Eukaryotes

C. Pridgeon, D. Corne
Published in: 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology
DOI: 10.1109/CIBCB.2005.1594949 (https://doi.org/10.1109/CIBCB.2005.1594949)
Citation count: 3

Abstract

We approach the general problem of finding a model that discriminates between classes of nucleotide sequences. A common approach in this area is to train a model, such as a neural network or a hidden Markov model, to perform the discrimination, using as inputs either the raw sequences encoded in a standard form or features derived from the raw data in a pre-processing stage. In this paper we introduce a novel discriminatory pattern structure for nucleotide sequences, called a hypermotif, and use evolutionary computation to evolve a collection of specific hypermotifs that discriminate between the classes in the data. The raw nucleotide data are then transformed into feature vectors, where the features are the individual scores on the evolved hypermotifs. After this transformation, any classification method may be used to build a predictive model. We test the approach on a database of eukaryotic promoters and find that it outperforms a standard multilayer perceptron (despite using a linear discriminant as the final classifier) and performs similarly to the best approach reported so far for these data (which uses a time-delay neural network).
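The transformation step described above (sequence in, vector of per-pattern scores out) can be illustrated with a minimal sketch. The abstract does not define the hypermotif structure or its scoring rule, so the code below substitutes a plain fixed-length motif with a best-alignment match fraction as a hypothetical stand-in. The names `best_match_score` and `to_feature_vector`, the example patterns, and the example sequences are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence


def best_match_score(pattern: str, sequence: str) -> float:
    """Hypothetical stand-in score (NOT the paper's hypermotif score).

    Slides the pattern along the sequence and returns the highest
    fraction of matching positions over all alignments.
    """
    k = len(pattern)
    if k == 0 or len(sequence) < k:
        return 0.0
    best = 0
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        matches = sum(1 for p, s in zip(pattern, window) if p == s)
        best = max(best, matches)
    return best / k


def to_feature_vector(sequence: str, patterns: Sequence[str]) -> List[float]:
    """Map one nucleotide sequence to one score per pattern."""
    return [best_match_score(p, sequence) for p in patterns]


# Illustrative patterns and sequences (assumed, not taken from the paper).
patterns = ["TATAAA", "CCAAT"]
sequences = ["GGCTATAAAGGC", "ACGTACGTACGT"]

# Each row is the feature vector for one sequence; any classifier
# (for example a linear discriminant) could be trained on these rows.
features = [to_feature_vector(s, patterns) for s in sequences]
```

In the paper's setting the patterns would be the evolved hypermotifs rather than hand-picked strings, and the resulting feature matrix would be passed to the final classifier.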