Rongjie Wang, Mingxiang Teng, Yang Bai, Tianyi Zang, Yadong Wang
Published in: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2016-12-01. DOI: https://doi.org/10.1109/BIBM.2016.7822621
DMcompress: Dynamic Markov models for bacterial genome compression
Genome data have been increasing exponentially over the last decade, and compressing genomes with Markov models has been proposed as an effective statistical method. However, existing methods use a static order-k Markov model to compress all genomes, which can result in a sub-optimal order for some genomes. In this paper, we propose a compression method that relies on a pre-analysis of the data before compression, with the aim of estimating the Markov model order k, yielding improvements over static Markov models. Experimental results on the latest complete bacterial genome data show that our method can compress genomes effectively, with better performance than the state-of-the-art method. The code of DMcompress is available at https://rongjiewang.github.io/DMcompress
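To make the idea of estimating the order k by pre-analysis concrete, the following is a minimal sketch and not the DMcompress implementation. It assumes one simple selection rule that the abstract does not specify: for each candidate k, compute the ideal code length (in bits) of the sequence under an adaptive order-k model with add-one smoothing, then keep the k with the shortest code length. The names `adaptive_code_length`, `estimate_order`, and `max_k` are hypothetical, and the sketch assumes the input contains only the symbols A, C, G, and T.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: no ambiguity codes such as N
INDEX = {symbol: i for i, symbol in enumerate(ALPHABET)}


def adaptive_code_length(seq: str, k: int) -> float:
    """Ideal code length in bits of seq under an adaptive order-k model.

    Each context starts with a pseudo-count of 1 per symbol (add-one
    smoothing), and counts are updated after every symbol, so a decoder
    could rebuild the same model without side information.
    """
    counts = defaultdict(lambda: [1] * len(ALPHABET))
    bits = 0.0
    for i, symbol in enumerate(seq):
        # The first k positions use a shorter context than k symbols.
        context = seq[max(0, i - k):i]
        ctx_counts = counts[context]
        probability = ctx_counts[INDEX[symbol]] / sum(ctx_counts)
        bits -= math.log2(probability)
        ctx_counts[INDEX[symbol]] += 1
    return bits


def estimate_order(seq: str, max_k: int = 12) -> int:
    """Return the order in 0..max_k with the smallest estimated code length."""
    return min(range(max_k + 1), key=lambda k: adaptive_code_length(seq, k))


if __name__ == "__main__":
    toy_genome = "ACGT" * 50  # toy input, not real genome data
    for k in range(4):
        print(k, round(adaptive_code_length(toy_genome, k), 1))
    print("estimated order:", estimate_order(toy_genome, max_k=4))
```

On this toy periodic input, a low non-zero order already predicts each symbol from its predecessor, while larger orders only pay extra for learning additional contexts. A static order chosen without looking at the data could miss this trade-off, which is the motivation the abstract gives for the pre-analysis step.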