Authors: Kanika Mehta, S. P. Ghrera
Published in: 2015 Eighth International Conference on Contemporary Computing (IC3)
Publication date: 2015-08-20
DOI: 10.1109/IC3.2015.7346654
DNA compression using referential compression algorithm
With rapid technological development and the growth of sequencing, an enormous volume of biological data has been generated. Data compression is employed to reduce the size of this data. To that end, this paper proposes a new reference-based compression approach. First, a reference is constructed from the common substrings of randomly selected input sequences. The reference set consists of key-value pairs, where each key is a fingerprint (a unique ID) and each value is a sequence of characters. Next, the given sequences are compressed using a referential compression algorithm: the input is matched against the reference, and each match found in the input is replaced by the corresponding fingerprint from the reference, thereby achieving better compression. The experimental results of the paper show that the proposed approach outperforms the existing approaches.
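
The two steps described in the abstract (building a fingerprint-to-substring reference, then replacing matches with fingerprints) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how common substrings are found or how fingerprints are assigned, so the sketch assumes fixed-length substrings of length `k` that occur in every sample, integer fingerprints, and a greedy left-to-right match. The names `build_reference`, `compress`, and `decompress` are hypothetical.

```python
from typing import Dict, List, Union

# A token is either a fingerprint (int) or a literal base (str).
Token = Union[int, str]


def build_reference(samples: List[str], k: int) -> Dict[int, str]:
    """Map fingerprint -> substring for every length-k substring
    shared by all sample sequences (an assumed notion of "common")."""
    common = None
    for seq in samples:
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        common = kmers if common is None else common & kmers
    # Sort so that fingerprint assignment is deterministic.
    return {fp: kmer for fp, kmer in enumerate(sorted(common or ()))}


def compress(seq: str, reference: Dict[int, str], k: int) -> List[Token]:
    """Greedily replace reference matches in seq with their fingerprints."""
    lookup = {kmer: fp for fp, kmer in reference.items()}
    tokens: List[Token] = []
    i = 0
    while i < len(seq):
        fp = lookup.get(seq[i:i + k])
        if fp is not None:
            tokens.append(fp)      # match: emit the fingerprint
            i += k
        else:
            tokens.append(seq[i])  # no match: emit the base as a literal
            i += 1
    return tokens


def decompress(tokens: List[Token], reference: Dict[int, str]) -> str:
    """Expand fingerprints back into their substrings."""
    return "".join(reference[t] if isinstance(t, int) else t for t in tokens)


if __name__ == "__main__":
    reference = build_reference(["ACGTACGT", "TTACGTAA"], k=4)
    tokens = compress("GGACGTACGTCC", reference, k=4)
    print(reference)
    print(tokens)
    print(decompress(tokens, reference))
```

In this sketch the token list only shrinks where the input shares substrings with the reference; a practical scheme would also need a compact binary encoding for fingerprints and literals, which the abstract does not describe.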