Genomic law guided gene prediction in fungi and metazoans.

International Journal of Computational Biology and Drug Design (JCR Q4; Pharmacology, Toxicology and Pharmaceutics), pages 157-69
Pub Date: 2013-01-01 · Epub Date: 2013-02-21 · DOI: 10.1504/IJCBDD.2013.052197
Yaping Fang, Jun Li
Citations: 2

Abstract

Genomic law guided gene prediction in fungi and metazoans.

Predicting protein-coding genes computationally is a fundamental step in genome annotation, yet accurately predicting eukaryotic genes in silico remains a challenge. Surveying the model genomes, we found that the Spearman's rank correlation coefficient between the number of experimentally verified genes and genome size was 0.96 for all eukaryotes except plants, indicating that the relationship between genome size and the number of coding genes can be expressed as a monotonic function. Regression analysis showed that the total number of protein-coding genes as a function of genome size followed a logarithmic equation. We integrated this equation into ab initio gene prediction software to guide gene prediction by constraining the total number of predicted genes, and evaluated the software on three eukaryotic genomes. More than 90% of false positive predictions were removed while more than 80% of true positives were retained, resulting in much higher specificity.
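
The workflow described in the abstract (rank correlation, a logarithmic fit, then a cap on the number of predicted genes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the fitted equation, so the form `genes = a + b * ln(size)` is assumed; the abstract does not say how the constraint is applied inside the ab initio predictor, so keeping the highest-scoring predictions is a hypothetical stand-in; and all numbers are invented toy values, not data from the paper.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fit_log_model(genome_sizes, gene_counts):
    """Least-squares fit of the ASSUMED form genes = a + b * ln(size)."""
    log_sizes = [math.log(s) for s in genome_sizes]
    n = len(log_sizes)
    mean_x, mean_y = sum(log_sizes) / n, sum(gene_counts) / n
    b = (sum((u - mean_x) * (v - mean_y) for u, v in zip(log_sizes, gene_counts))
         / sum((u - mean_x) ** 2 for u in log_sizes))
    a = mean_y - b * mean_x
    return a, b


def constrain_predictions(predictions, genome_size, a, b):
    """Keep the top-scoring (gene_id, score) pairs up to the expected count.

    Ranking by score is a hypothetical selection rule, not the paper's method.
    """
    budget = max(0, round(a + b * math.log(genome_size)))
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)
    return ranked[:budget]


# Invented toy values (genome size in Mb, gene count); NOT from the paper.
toy_sizes_mb = [10.0, 30.0, 100.0, 300.0, 1000.0]
toy_gene_counts = [5000, 9000, 13000, 17500, 21000]

rho = spearman(toy_sizes_mb, toy_gene_counts)
a, b = fit_log_model(toy_sizes_mb, toy_gene_counts)
print(f"Spearman rho = {rho:.2f}; fitted genes = {a:.0f} + {b:.0f} * ln(size_mb)")
```

In practice one would pass the raw predictions of a gene finder, each with some confidence score, to `constrain_predictions` along with the assembly size; the sketch only shows how a fitted gene-count budget can prune low-scoring candidates.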
