P. Hanus, J. Dingel, Georg Chalkidis, J. Hagenauer
2009 Data Compression Conference, published 2009-03-16. DOI: 10.1109/DCC.2009.64 (https://doi.org/10.1109/DCC.2009.64)
Source Coding Scheme for Multiple Sequence Alignments
The rapid development of DNA sequencing technologies is increasing the amount of publicly available genomic data exponentially. Whole-genome multiple sequence alignments are a particularly voluminous, frequently downloaded static dataset. In this work we propose an asymmetric source coding scheme for such alignments that combines evolutionary prediction with lossless black-and-white image compression. Compared with the Lempel-Ziv algorithm used so far, the compression rates are almost halved.
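The abstract gives no implementation details, so the following is only a toy sketch of how prediction and a black-and-white image could fit together, not the authors' scheme. It assumes a deliberately trivial predictor (copy the first row of the alignment) in place of the evolutionary prediction, and the names `encode` and `decode` are hypothetical. Each pixel of the binary mask records whether the prediction was wrong, and only the mispredicted symbols are stored separately. In a real system the mask would then go to a lossless bi-level image coder, which this sketch omits.

```python
def encode(alignment):
    """Split an alignment into a reference row, a 0/1 mask, and residual symbols.

    Assumption: the first row serves as the prediction for every other row.
    This stands in for the evolutionary prediction named in the abstract.
    """
    ref = alignment[0]
    mask = []       # one row of 0/1 pixels per non-reference sequence
    residuals = []  # actual symbols at mispredicted positions, in scan order
    for row in alignment[1:]:
        bits = []
        for predicted, actual in zip(ref, row):
            if predicted == actual:
                bits.append(0)          # white pixel: prediction was correct
            else:
                bits.append(1)          # black pixel: prediction failed
                residuals.append(actual)
        mask.append(bits)
    return ref, mask, residuals


def decode(ref, mask, residuals):
    """Rebuild the alignment exactly from reference, mask, and residuals."""
    remaining = iter(residuals)
    rows = [ref]
    for bits in mask:
        rows.append("".join(next(remaining) if bit else predicted
                            for predicted, bit in zip(ref, bits)))
    return rows


# Hypothetical three-sequence alignment; '-' marks a gap.
alignment = ["ACGTAC-GT", "ACGTACAGT", "ACCTAC-GT"]
ref, mask, residuals = encode(alignment)
restored = decode(ref, mask, residuals)
```

The better the predictor, the sparser the mask and the shorter the residual list, which is where any gain over a general-purpose compressor would have to come from. The split is also asymmetric in the sense that the encoder does the prediction-quality work once, while decoding is a single pass over the mask.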