Zhimeng Xu , Yuting Mai , Denghui Liu , Wenjun He , Xinyuan Lin , Chi Xu , Lei Zhang , Xin Meng , Joseph Mafofo , Walid Abbas Zaher , Ashish Koshy , Yi Li , Nan Qiao
DOI: 10.1016/j.ailsci.2021.100011
Published: 2021-12-01 (Journal Article)
https://www.sciencedirect.com/science/article/pii/S2667318521000118
Fast-bonito: A faster deep learning based basecaller for nanopore sequencing
Nanopore sequencing from Oxford Nanopore Technologies (ONT) is a promising third-generation sequencing (TGS) technology that generates longer sequencing reads than next-generation sequencing (NGS) technology. A basecaller is a piece of software that translates the raw electrical current signals into nucleotide sequences. The accuracy of the basecaller is crucial to downstream analysis. Bonito is a deep learning-based basecaller recently developed by ONT. Its neural network architecture is composed of a single convolutional layer followed by three stacked bidirectional gated recurrent unit (GRU) layers. Although Bonito achieves state-of-the-art basecalling accuracy, it is too slow to be used in production. We therefore developed Fast-Bonito, using neural architecture search (NAS) to find a brand-new neural network backbone and training it from scratch with several advanced deep learning training techniques. The new Fast-Bonito model balances speed and accuracy. Fast-Bonito was 153.8% faster than the original Bonito on an NVIDIA V100 GPU. When running on a HUAWEI Ascend 910 NPU, Fast-Bonito was 565% faster than the original Bonito. The accuracy of Fast-Bonito was also slightly higher than that of Bonito. We have made Fast-Bonito open source, hoping it will boost the adoption of TGS in both academia and industry.
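The "percent faster" figures in the abstract can be easier to compare as throughput ratios. The sketch below is only an illustration, and it assumes that "N% faster" means a throughput gain of N% over the Bonito baseline (so the ratio is 1 + N/100); the abstract does not state how the percentages were computed. The function names `speedup_factor` and `relative_runtime` are hypothetical helpers, not part of Fast-Bonito.

```python
def speedup_factor(percent_faster: float) -> float:
    """Throughput ratio versus the baseline.

    ASSUMPTION: 'N% faster' is read as a relative throughput gain,
    i.e. new_throughput = baseline_throughput * (1 + N / 100).
    """
    return 1.0 + percent_faster / 100.0


def relative_runtime(percent_faster: float) -> float:
    """Fraction of the baseline wall-clock time needed for the same workload."""
    return 1.0 / speedup_factor(percent_faster)


# Percentages as reported in the abstract.
reported = [
    ("NVIDIA V100 GPU", 153.8),
    ("HUAWEI Ascend 910 NPU", 565.0),
]

for device, pct in reported:
    ratio = speedup_factor(pct)
    runtime = relative_runtime(pct)
    print(f"{device}: {ratio:.3f}x throughput, "
          f"{runtime:.1%} of baseline runtime")
```

Under this reading, the reported figures correspond to about 2.538 times the Bonito throughput on the V100 and about 6.65 times on the Ascend 910. If the percentages were instead defined on runtime reduction, the ratios would differ.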
Journal: Artificial intelligence in the life sciences
Subject areas: Pharmacology, Biochemistry, Genetics and Molecular Biology (General), Computer Science Applications, Health Informatics, Drug Discovery, Veterinary Science and Veterinary Medicine (General)