GAAP: A GUI-based Genome Assembly and Annotation Package.

IF 1.8 | Zone 4, Biology | JCR Q4, Biochemistry & Molecular Biology | Current Genomics | Pub Date: 2022-06-10 | DOI: 10.2174/1389202923666220128155537
Deepak Singla, Inderjit Singh Yadav
Citations: 2

Abstract


Background: Next-generation sequencing (NGS) technologies are continuously used to generate high-throughput sequencing data, which calls for easy-to-use, GUI-based data analysis software. Such software could run in parallel with sequencing to automate data analysis. At present, very few such tools are available and most of them are commercial, which creates a gap between data generation and data analysis.

Methods: GAAP is developed on the NodeJS platform and uses HTML and JavaScript as the front end for communication with users. FastQC and Trimmomatic are implemented for quality checking and quality control. Velvet and Prodigal are integrated for genome assembly and gene prediction, respectively. Annotation is performed with remote NCBI BLAST and IPR-Scan. In the back end, Perl and JavaScript are used for data processing. To evaluate the performance of GAAP, we assembled a viral (SRR11621811), a bacterial (SRR17153353) and a human (SRR16845439) genome.

Results: We used GAAP to assemble and annotate a COVID-19 genome on a desktop computer, which resulted in a single contig of 27,994 bp with 99.57% reference genome coverage. Gene prediction on this assembly yielded 11 genes, of which 10 were annotated using the annotation module of GAAP. We also assembled a bacterial and a human genome into 138 and 194,281 contigs with N50 values of 100,399 and 610, respectively.

Conclusion: In this study, we have developed freely available, platform-independent genome assembly and annotation (GAAP) software (www.deepaklab.com/gaap). The software acts as a complete data analysis package with quality check, quality control, de novo genome assembly, gene prediction and annotation (BLAST, PFAM, GO term, pathway and enzyme mapping) modules.
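The Results report assembly contiguity as N50 values. As a rough illustration of what that statistic measures, the sketch below computes N50 from a list of contig lengths. It is not GAAP's own code (the abstract describes a Perl and JavaScript back end); the function name `n50` and the toy contig lengths are assumptions made up for this example.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")

    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Toy assembly with made-up lengths (not data from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

For an assembly that consists of a single contig, such as the viral genome described above, N50 simply equals the length of that contig.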

Source journal
Current Genomics (Biology: Biochemistry & Molecular Biology)
CiteScore: 5.20
Self-citation rate: 0.00%
Articles published: 29
Review time: >0 weeks
About the journal: Current Genomics is a peer-reviewed journal that provides essential reading about the latest and most important developments in genome science and related fields of research. Systems biology, systems modeling, machine learning, network inference, bioinformatics, computational biology, epigenetics, single cell genomics, extracellular vesicles, quantitative biology, and synthetic biology for the study of evolution, development, maintenance, aging and that of human health, human diseases, clinical genomics and precision medicine are topics of particular interest. The journal covers plant genomics. The journal will not consider articles dealing with breeding and livestock. Current Genomics publishes three types of articles: i) Research papers from internationally recognized experts reporting on new and original data generated at the genome scale level. Position papers dealing with new or challenging methodological approaches, whether experimental or mathematical, are greatly welcome in this section. ii) Authoritative and comprehensive full-length or mini reviews from widely recognized experts, covering the latest developments in genome science and related fields of research such as systems biology, statistics and machine learning, quantitative biology, and precision medicine. Proposals for mini-hot topics (2-3 review papers) and full hot topics (6-8 review papers) guest edited by internationally recognized experts are welcome in this section. Hot topic proposals should not contain original data and they should contain articles originating from at least 2 different countries. iii) Opinion papers from internationally recognized experts addressing contemporary questions and issues in the field of genome science and systems biology and basic and clinical research practices.