mBLAST: Keeping up with the sequencing explosion for (meta)genome analysis.

Journal of Data Mining in Genomics & Proteomics; Pub Date: 2015-08-01; Epub Date: 2013-07-31; DOI: 10.4172/2153-0602.1000135

Curtis Davis, Karthik Kota, Venkat Baldhandapani, Wei Gong, Sahar Abubucker, Eric Becker, John Martin, Kristine M Wylie, Radhika Khetani, Matthew E Hudson, George M Weinstock, Makedonka Mitreva

Volume 4, Issue 3. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4612494/pdf/nihms696431.pdf

Citations: 28

Abstract

Recent advances in next-generation sequencing technologies require alignment algorithms and software that can keep pace with the heightened data production. Standard algorithms, especially protein similarity searches, represent significant bottlenecks in analysis pipelines. For metagenomic approaches in particular, it is now often necessary to search hundreds of millions of sequence reads against large databases. Here we describe mBLAST, an accelerated search algorithm for translated and/or protein alignments against large datasets; it is based on the Basic Local Alignment Search Tool (BLAST) and retains BLAST's high sensitivity. The mBLAST algorithms achieve a substantial speedup over the National Center for Biotechnology Information (NCBI) programs BLASTX, TBLASTX and BLASTP for large datasets, allowing analysis within reasonable timeframes on standard computer architectures. In this article, the impact of mBLAST is demonstrated with sequences originating from the microbiota of healthy humans from the Human Microbiome Project. mBLAST is designed as a plug-in replacement for BLAST for any study that involves short-read sequences and includes high-throughput analysis. The mBLAST software is freely available to academic users at www.multicorewareinc.com.
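For readers unfamiliar with the term, a "translated" search in the BLASTX sense compares the six possible reading-frame translations of a nucleotide read against protein sequences. The following minimal Python sketch illustrates only that translation step, using the standard genetic code. It is not mBLAST's algorithm or code, which the abstract does not describe; the function names and the example read are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(read):
    """Hypothetical helper: map frame labels (+1..+3, -1..-3) to protein strings."""
    read = read.upper()
    frames = {}
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    # Illustrative read, not taken from the Human Microbiome Project data.
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each read yields six candidate protein strings to compare against the database, which is one reason translated searches over hundreds of millions of reads become a bottleneck.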

Source journal

Journal of Data Mining in Genomics & Proteomics

Self-citation rate

0.00%

Articles published