GINGER: an integrated method for high-accuracy prediction of gene structure in higher eukaryotes at the gene and exon level.

DNA Research | Pub Date: 2023-08-01 | DOI: 10.1093/dnares/dsad017 | IF 3.9 | Zone 2 (Biology) | JCR Q1 (Genetics & Heredity)
Takeaki Taniguchi, Miki Okuno, Takahiro Shinoda, Fumiya Kobayashi, Kazuki Takahashi, Hideaki Yuasa, Yuta Nakamura, Hiroyuki Tanaka, Rei Kajitani, Takehiko Itoh
DNA Research, volume 30, issue 4. Publication type: Journal Article. PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/cb/45/dsad017.PMC10439787.pdf
Citation count: 0

Abstract

Predicting gene structure within a genome sequence is the starting point of genome analysis, and its accuracy strongly affects the quality of subsequent analyses. Gene structure prediction methods fall roughly into four groups: RNA-Seq-based methods, ab initio methods, homology-based methods, and integrated methods that combine individual predictions. Integrated methods are the mainstream in recent genome projects because they improve accuracy by combining individual predictions or selecting the best among them; however, adequate prediction accuracy for eukaryotic species has not yet been achieved. We therefore developed GINGER, an integrated tool that addresses several issues in gene structure prediction for higher eukaryotes. GINGER handles artefacts in alignments of RNA and protein sequences, reconstructs gene structures via dynamic programming over appropriately weighted and scored exon, intron, and intergenic regions, and applies different prediction processes and filtering criteria to multi-exon and single-exon genes. Together, these steps yield a significant improvement in accuracy over existing integration methods. GINGER's distinguishing feature is its high prediction accuracy at the gene and exon levels, which is particularly pronounced for species with more complex gene architectures. GINGER is implemented in Nextflow, which allows efficient use of computing resources.
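
The abstract does not spell out the dynamic-programming formulation, so the following is only a generic illustration of the idea of choosing a best-scoring, non-overlapping chain of scored regions. It is a minimal sketch of classic weighted interval scheduling, not GINGER's actual algorithm; the function name `best_chain`, the `(start, end, score)` tuple layout, and the toy scores are all assumptions made for this example.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical representation: (start, end, score), with an inclusive end.
Segment = Tuple[int, int, float]


def best_chain(segments: List[Segment]) -> Tuple[float, List[Segment]]:
    """Return the maximal total score and the non-overlapping segments achieving it."""
    segs = sorted(segments, key=lambda s: s[1])  # order by end coordinate
    ends = [s[1] for s in segs]
    n = len(segs)
    best = [0.0] * (n + 1)   # best[i]: best score using only the first i segments
    take = [False] * (n + 1)  # take[i]: whether segment i is used in best[i]
    prev = [0] * (n + 1)      # prev[i]: how many earlier segments end before segment i starts

    for i, (start, _end, score) in enumerate(segs, start=1):
        # Count earlier segments whose end lies strictly before this start.
        p = bisect_right(ends, start - 1, 0, i - 1)
        prev[i] = p
        with_segment = best[p] + score
        if with_segment > best[i - 1]:
            best[i] = with_segment
            take[i] = True
        else:
            best[i] = best[i - 1]

    # Trace back through the table to recover the chosen segments.
    chain: List[Segment] = []
    i = n
    while i > 0:
        if take[i]:
            chain.append(segs[i - 1])
            i = prev[i]
        else:
            i -= 1
    chain.reverse()
    return best[n], chain


# Toy candidates with made-up scores, for illustration only.
candidates = [(1, 100, 5.0), (50, 150, 8.0), (120, 200, 4.0), (210, 300, 3.0)]
total, chosen = best_chain(candidates)
print(total, chosen)
```

In this toy input the single highest-scoring candidate overlaps two others, and the table-based search prefers the chain of three compatible segments because their combined score is higher. A real integrator would additionally score introns and intergenic regions and enforce splice-site and reading-frame constraints, which this sketch leaves out.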

Source journal: DNA Research (Biology, Genetics)
CiteScore: 6.00
Self-citation rate: 4.90%
Articles published: 39
Review time: 4.5 months
About the journal: DNA Research is an international peer-reviewed journal that aims to publish papers of the highest quality on broad aspects of DNA and genome-related research. Emphasis is placed on the following subjects: 1) sequencing and characterization of genomes and important genomic regions; 2) comprehensive analysis of the functions of genes, gene families, and genomes; 3) techniques and equipment useful for structural and functional analysis of genes, gene families, and genomes; 4) computer algorithms and their applications relevant to structural and functional analysis of genes and genomes. The journal also welcomes novel findings in other scientific disciplines related to genomes.