Curtis Davis, Karthik Kota, Venkat Baldhandapani, Wei Gong, Sahar Abubucker, Eric Becker, John Martin, Kristine M Wylie, Radhika Khetani, Matthew E Hudson, George M Weinstock, Makedonka Mitreva
Journal of Data Mining in Genomics & Proteomics, vol. 4, no. 3. DOI: 10.4172/2153-0602.1000135 (https://doi.org/10.4172/2153-0602.1000135). Published 2015-08-01 (Epub 2013-07-31). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4612494/pdf/nihms696431.pdf
Citations: 28
mBLAST: Keeping up with the sequencing explosion for (meta)genome analysis.
Recent advances in next-generation sequencing technologies require alignment algorithms and software that can keep pace with the heightened data production. Standard algorithms, especially protein similarity searches, represent significant bottlenecks in analysis pipelines. For metagenomic approaches in particular, it is now often necessary to search hundreds of millions of sequence reads against large databases. Here we describe mBLAST, an accelerated search algorithm for translated and/or protein alignments against large datasets; it is based on the Basic Local Alignment Search Tool (BLAST) and retains BLAST's high sensitivity. The mBLAST algorithms achieve substantial speedups over the National Center for Biotechnology Information (NCBI) programs BLASTX, TBLASTX and BLASTP for large datasets, allowing analysis within reasonable timeframes on standard computer architectures. In this article, the impact of mBLAST is demonstrated with sequences originating from the microbiota of healthy humans from the Human Microbiome Project. mBLAST is designed as a plug-in replacement for BLAST for any study that involves short-read sequences and includes high-throughput analysis. The mBLAST software is freely available to academic users at www.multicorewareinc.com.