The identification of strings that are, by some measure, redundant or rare in the context of larger sequences is an implicit goal of any data compression method. In the straightforward approach to searching for unusual substrings, the words (up to a certain length) are enumerated more or less exhaustively and individually checked in terms of observed and expected frequencies, variances, and scores of discrepancy and significance thereof. As is well known, clever methods are available to compute and organize the counts of occurrences of all substrings of a given string. The corresponding tables take the form of a tree-like structure, a special kind of digital search index or trie. We show here that under several accepted measures of deviation from expected frequency, the candidate over- or under-represented words are restricted to the O(n) words that end at internal nodes of a compact suffix tree, as opposed to the Θ(n^2) possible substrings. This surprising fact is a consequence of properties of the following form: if a word that ends in the middle of an arc is, say, over-represented, then its extension to the nearest node of the tree is even more so. Based on this, we design global linear detectors of favored and unfavored words for our probabilistic framework, and display the results of some preliminary experiments that apply our constructions to the analysis of genomic sequences.
{"title":"Linear global detectors of redundant and rare substrings","authors":"A. Apostolico, M. Bock, S. Lonardi","doi":"10.1109/DCC.1999.755666","DOIUrl":"https://doi.org/10.1109/DCC.1999.755666","url":null,"abstract":"The identification of strings that are, by some measure, redundant or rare in the context of larger sequences is an implicit goal of any data compression method. In the straightforward approach to searching for unusual substrings, the words (up to a certain length) are enumerated more or less exhaustively and individually checked in terms of observed and expected frequencies, variances, and scores of discrepancy and significance thereof. As is well known, clever methods are available to compute and organize the counts of occurrences of all substrings of a given string. The corresponding tables take up the tree-like structure of a special kind of digital search index or trie. We show here that under several accepted measures of deviation from expected frequency, the candidate over- or under-represented words are restricted to the O(n) words that end at internal nodes of a compact suffix tree, as opposed to the /spl Theta/(n/sup 2/) possible substrings. This surprising fact is a consequence of properties in the form that if a word that ends in the middle of an arc is, say, over-represented, then its extension to the nearest node of the tree is even more so. Based on this, we design global linear detectors of favoured and unfavored words for our probabilistic framework, and display the results of some preliminary that apply our constructions to the analysis of genomic sequences.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131410210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. Lossless image coding may be performed by applying arithmetic coding sequentially to probabilities conditioned on the past data. Therefore the model is very important. A new image model is applied to image coding. The model is based on a Markov process involving hidden states. An underlying Markov process called the slice process specifies D rows, each the width of the image. Each new row of the image coincides with row N of an instance of the slice process. The N-1 previous rows are read from the causal part of the image and the last D-N rows are hidden. This gives a description of the current row conditioned on the N-1 previous rows. From the slice process we may decompose the description into a sequence of conditional probabilities, involving a combination of a forward and a backward pass. In effect the causal part of the last N rows of the image becomes the context. The forward pass obtained directly from the slice process starts from the left for each row with D-N hidden rows. The backward pass starting from the right additionally has the current row as hidden. The backward pass may be described as a completion of the forward pass. It plays the role of normalizing the possible completions of the forward pass for each pixel. The hidden states may effectively be represented in a trellis structure as in an HMM. For the slice process we use a state of D rows and V-1 columns, thus involving V columns in each transition. The new model was applied to a bi-level image (SO9 of the JBIG test set) in a two-part coding scheme.
{"title":"Image coding using Markov models with hidden states","authors":"S. Forchhammer","doi":"10.1109/DCC.1999.785681","DOIUrl":"https://doi.org/10.1109/DCC.1999.785681","url":null,"abstract":"Summary form only given. Lossless image coding may be performed by applying arithmetic coding sequentially to probabilities conditioned on the past data. Therefore the model is very important. A new image model is applied to image coding. The model is based on a Markov process involving hidden states. An underlying Markov process called the slice process specifies D rows with the width of the image. Each new row of the image coincides with row N of an instance of the slice process. The N-1 previous rows are read from the causal part of the image and the last D-N rows are hidden. This gives a description of the current row conditioned on the N-1 previous rows. From the slice process we may decompose the description into a sequence of conditional probabilities, involving a combination of a forward and a backward pass. In effect the causal part of the last N rows of the image becomes the context. The forward pass obtained directly from the slice process starts from the left for each row with D-N hidden rows. The backward pass starting from the right additionally has the current row as hidden. The backward pass may be described as a completion of the forward pass. It plays the role of normalizing the possible completions of the forward pass for each pixel. The hidden states may effectively be represented in a trellis structure as in an HMM. For the slice process we use a state of D rows and V-1 columns, thus involving V columns in each transition. The new model was applied to a bi-level image (SO9 of the JBIG test set) in a two-part coding scheme.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122458257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Wagner, Ralf Herz, H. Hartenstein, R. Hamzaoui, D. Saupe
Summary form only given. We present a new AVQ-based video coder for very low bitrates. To encode a block from a frame, the encoder offers three modes: (1) a block from the same position in the last frame can be taken; (2) the block can be represented with a vector from the codebook; or (3) a new vector that sufficiently represents the block can be inserted into the codebook. For mode 2 a mean-removed VQ scheme is used. The decision on how blocks are encoded and how the codebook is updated is made in a rate-distortion (R-D) optimized fashion. The codebook of shape blocks is updated once per frame. First results for an implementation of such a scheme have been reported previously. Here we extend the method to incorporate a wavelet image transform before coding in order to enhance the compression performance. In addition the rate-distortion optimization is comprehensively discussed. Our R-D optimization is based on an efficient convex-hull computation. This method is compared to common R-D optimizations that use a Lagrangian multiplier approach. In the discussion of our R-D method we show the similarities and differences between our scheme and the generalized threshold replenishment (GTR) method of Fowler et al. (1997). Furthermore, we demonstrate that the translation of our R-D optimized AVQ into the wavelet domain leads to improved coding performance. We present coding results that show that one can achieve the same encoding quality as with comparable standard transform coding (H.263). In addition we offer an empirical analysis of the short- and long-term behavior of the adaptive codebook. This analysis indicates that the AVQ method uses the vectors in its codebook for some kind of long-term prediction.
{"title":"A video codec based on R/D-optimized adaptive vector quantization","authors":"M. Wagner, Ralf Herz, H. Hartenstein, R. Hamzaoui, D. Saupe","doi":"10.1109/DCC.1999.785713","DOIUrl":"https://doi.org/10.1109/DCC.1999.785713","url":null,"abstract":"Summary form only given. We present a new AVQ-based video coder for very low bitrates. To encode a block from a frame, the encoder offers three modes: (1) a block from the same position in the last frame can be taken; (2) the block can be represented with a vector from the codebook; or (3) a new vector, that sufficiently represents a block, can be inserted into the codebook. For mode 2 a mean-removed VQ scheme is used. The decision on how blocks are encoded and how the codebook is updated is done in an rate-distortion (R-D) optimized fashion. The codebook of shape blocks is updated once per frame. First results for an implementation of such a scheme have been reported previously. Here we extend the method to incorporate a wavelet image transform before coding in order to enhance the compression performance. In addition the rate-distortion optimization is comprehensively discussed. Our R-D optimization is based on an efficient convex-hull computation. This method is compared to common R-D optimizations that use a Lagrangian multiplier approach. In the discussion of our R-D method we show the similarities and differences between our scheme and the generalized threshold replenishment (GTR) method of Fowler et al. (1997). Furthermore, we demonstrate that the translation of our R-D optimized AVQ into the wavelet domain leads to an improved coding performance. We present coding results that show that one can achieve the same encoding quality as with comparable standard transform coding (H.263). In addition we offer an empirical analysis of the short- and long-term behavior of the adaptive codebook. This analysis indicates that the AVQ method uses the vectors in its codebook for some kind of long-term prediction.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127986219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leiming Qian, Douglas L. Jones, K. Ramchandran, S. Appadwedula
With the rapid growth of multimedia content in wireless communication, there is an increasing demand for efficient image and video transmission systems. We present a joint source-channel matching scheme for wireless video transmission which jointly optimizes the source and channel coder to yield the optimal transmission quality while satisfying real-time delay and buffer constraints. We utilize a parametric model approach which avoids the necessity of having detailed a priori knowledge of the coders, thus making the scheme applicable to a wide variety of source and channel coder pairs. Simulations show that the scheme yields excellent results and works for several different types of source and channel coders.
{"title":"A general joint source-channel matching method for wireless video transmission","authors":"Leiming Qian, Douglas L. Jones, K. Ramchandran, S. Appadwedula","doi":"10.1109/DCC.1999.755691","DOIUrl":"https://doi.org/10.1109/DCC.1999.755691","url":null,"abstract":"With the rapid growth of multimedia content in wireless communication, there is an increasing demand for efficient image and video transmission systems. We present a joint source-channel matching scheme for wireless video transmission which jointly optimizes the source and channel coder to yield the optimal transmission quality while satisfying real-time delay and buffer constraints. We utilize a parametric model approach which avoids the necessity of having detailed a priori knowledge of the coders, thus making the scheme applicable to a wide variety of source and channel coder pairs. Simulations show that the scheme yields excellent results and works for several different types of source and channel coders.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128477600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. Arithmetic coding is a popular and efficient lossless compression technique that maps a sequence of source symbols to an interval of numbers between zero and one. We consider the important problem of decoding an arithmetic code stream when an initial segment of that code stream is unknown. We call decoding under these conditions resynchronizing an arithmetic code. This problem has importance in both error resilience and cryptology. If an initial segment of the code stream is corrupted by channel noise, then the decoder must attempt to determine the original source sequence without full knowledge of the code stream. In this case, the ability to resynchronize helps the decoder to recover from the channel errors. But in the situation of encryption one would like to have very high time complexity for resynchronization. We consider the problem of resynchronizing simple arithmetic codes. This research lays the groundwork for future analysis of arithmetic codes with high-order context models. In order for the decoder to achieve full resynchronization, the unknown, initial b bits of the code stream must be determined exactly. When the source is approximately IID, the search complexity associated with choosing the correct sequence is at least O(2^(b/2)). Therefore, when b is 100 or more, the time complexity required to achieve full resynchronization is prohibitively high. To partially resynchronize, the decoder must determine the coding interval after b bits have been output by the encoder. For a stationary source and a finite-precision static binary arithmetic coder, the complexity of determining the code interval is O(2^(2s)), where the precision is s bits.
{"title":"Resynchronization properties of arithmetic coding","authors":"P. W. Moo, Xiaolin Wu","doi":"10.1109/DCC.1999.785697","DOIUrl":"https://doi.org/10.1109/DCC.1999.785697","url":null,"abstract":"Summary form only given. Arithmetic coding is a popular and efficient lossless compression technique that maps a sequence of source symbols to an interval of numbers between zero and one. We consider the important problem of decoding an arithmetic code stream when an initial segment of that code stream is unknown. We call decoding under these conditions resynchronizing an arithmetic code. This problem has importance in both error resilience and cryptology. If an initial segment of the code stream is corrupted by channel noise, then the decoder must attempt to determine the original source sequence without full knowledge of the code stream. In this case, the ability to resynchronize helps the decoder to recover from the channel errors. But in the situation of encryption one would like to have very high time complexity for resynchronization. We consider the problem of resynchronizing simple arithmetic codes. This research lays the groundwork for future analysis of arithmetic codes with high-order context models. In order for the decoder to achieve full resynchronization, the unknown, initial b bits of the code stream must be determined exactly. When the source is approximately IID, the search complexity associated with choosing the correct sequence is at least O(2/sup b/2/). Therefore, when b is 100 or more, the time complexity required to achieve full resynchronization is prohibitively high. To partially resynchronize, the decoder must determine the coding interval after b bits have been output by the encoder. For a stationary source and a finite-precision static binary arithmetic coder, the complexity of determining the code interval is O(2/sup 2s/), where the precision is s bits.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116054011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The practical lossless digital image compressors that achieve the best results in terms of compression ratio are also simple and fast algorithms with low complexity both in terms of memory usage and running time. Surprisingly, the compression ratio achieved by these systems cannot be substantially improved even by using image-by-image optimization techniques or more sophisticated and complex algorithms. Meyer and Tischer (1998) were able, with their TMW, to improve on some of the current best results (they do not report results for all test images) by using global optimization techniques and multiple blended linear predictors. Our investigation aims to determine the effectiveness of an algorithm that uses multiple adaptive linear predictors, locally optimized on a pixel-by-pixel basis. The results we obtained on a test set of nine standard images are encouraging: on some images we improve over CALIC.
{"title":"Adaptive linear prediction lossless image coding","authors":"G. Motta, J. Storer, B. Carpentieri","doi":"10.1109/DCC.1999.755699","DOIUrl":"https://doi.org/10.1109/DCC.1999.755699","url":null,"abstract":"The practical lossless digital image compressors that achieve the best results in terms of compression ratio are also simple and fast algorithms with low complexity both in terms of memory usage and running time. Surprisingly, the compression ratio achieved by these systems cannot be substantially improved even by using image-by-image optimization techniques or more sophisticate and complex algorithms. Meyer and Tischer (1998) were able, with their TMW, to improve some current best results (they do not report results for all test images) by using global optimization techniques and multiple blended linear predictors. Our investigation is directed to determine the effectiveness of an algorithm that uses multiple adaptive linear predictors, locally optimized on a pixel-by-pixel basis. The results we obtained on a test set of nine standard images are encouraging, where we improve over CALIC on some images.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125443982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When developing multimedia-based material, the format in which we store our images is an important question. It becomes even more crucial if we want our material to be sold commercially, because the cost of producing a CD (master copy and copies) changes according to the amount of data on the disk. In the case of fractal-based compression (FIF) we can save space and money. Our studies verified that this compression method also results in an improvement in quality at common raster image sizes (640×480, 800×600, 1024×768). This is especially true for images full of shades, and for the enlargement of image parts.
{"title":"Comparison and application possibilities of JPEG and fractal-based image compressing methods in the development of multimedia-based material","authors":"J. Berke","doi":"10.1109/DCC.1999.785674","DOIUrl":"https://doi.org/10.1109/DCC.1999.785674","url":null,"abstract":"When developing multimedia-based material the format in which we want to enclose our images is an important question. It is a more crucial question if we want our material to appear in commerce because the cost of developing CD (master copy and copies) changes according to the amount of data on the disk. In the case of fractal-based compression (FIF) we can save space and money. Our studies verified that this compression method also results in an improvement in quality in the case of raster image size (640/spl times/480, 800/spl times/600, 1024/spl times/768). This is extremely true in the case of images full of shades, and in the case of the enlargement of parts.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125015017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. There has been a growing interest in reversible integer-to-integer wavelet transforms for image coding applications. In this paper, a number of such transforms are compared on the basis of their objective and subjective lossy compression performance, lossless compression performance, and computational complexity. Of the transforms considered, several were found to perform particularly well, with the best choice for a given application depending on the relative importance of lossless compression performance, lossy compression performance, and computational complexity. Reversible integer-to-integer versions of numerous transforms are also compared to their conventional (i.e., nonreversible real-valued) counterparts for lossy compression. In many cases, the reversible integer-to-integer and conventional versions of a transform were found to yield results with comparable image quality.
{"title":"Performance evaluation of reversible integer-to-integer wavelet transforms for image compression","authors":"M. Adams, F. Kossentini","doi":"10.1109/DCC.1999.785671","DOIUrl":"https://doi.org/10.1109/DCC.1999.785671","url":null,"abstract":"[Summary form only given]. There has been a growing interest in reversible integer-to-integer wavelet transforms for image coding applications. In this paper, a number of such transforms are compared on the basis of their objective and subjective lossy compression performance, lossless compression performance, and computational complexity. Of the transforms considered, several were found to perform particularly well, with the best choice for a given application depending on the relative importance of lossless compression performance, lossy compression performance, and computational complexity. Reversible integer-to-integer versions of numerous transforms are also compared to their conventional (i.e., nonreversible real-valued) counterparts for lossy compression. In many cases, the reversible integer-to-integer and conventional versions of a transform were found to yield results with comparable image quality.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"175 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125670038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. The increased information content of hyperspectral imagery over multispectral data has attracted significant interest from the defense and remote sensing communities. We develop a mechanism for compressing hyperspectral imagery with no loss of information. The challenge of hyperspectral image compression lies in the non-isotropy and non-stationarity that is displayed across the spectral channels. Short-range dependence is exhibited over the spatial axes due to the finite extent of objects/texture on the imaged area, while long-range dependence is shown by the spectral axis due to the spectral response of the imaged pixel and transmission medium. A secondary, though critical, challenge is one of speed. In order to be of practical interest, a good solution must be able to scale up to speeds of the order of 20 MByte/s. We use an integerizable eigendecomposition along the spectral channel to optimally extract spectral redundancies. Subsequently, we apply wavelet-based encoding to transmit the residuals of eigendecomposition. We use contextual arithmetic encoding implemented with several innovations that guarantee speed and performance. Our implementation attains operating speeds of 550 kBytes of raw imagery per second, and achieves a compression ratio of around 2.7:1 on typical AVIRIS data. This demonstrates the utility and applicability of our algorithm towards realizing a deployable hyperspectral image compression system.
{"title":"Eigen wavelet: hyperspectral image compression algorithm","authors":"S. Srinivasan, L. Kanal","doi":"10.1109/DCC.1999.785707","DOIUrl":"https://doi.org/10.1109/DCC.1999.785707","url":null,"abstract":"Summary form only given. The increased information content of hyperspectral imagery over multispectral data has attracted significant interest from the defense and remote sensing communities. We develop a mechanism for compressing hyperspectral imagery with no loss of information. The challenge of hyperspectral image compression lies in the non-isotropy and non-stationarity that is displayed across the spectral channels. Short-range dependence is exhibited over the spatial axes due to the finite extent of objects/texture on the imaged area, while long-range dependence is shown by the spectral axis due to the spectral response of the imaged pixel and transmission medium. A secondary, though critical, challenge is one of speed. In order to be of practical interest, a good solution must be able to scale up to speeds of the order of 20 MByte/s. We use an integerizable eigendecomposition along the spectral channel to optimally extract spectral redundancies. Subsequently, we apply wavelet-based encoding to transmit the residuals of eigendecomposition. We use contextual arithmetic encoding implemented with several innovations that guarantee speed and performance. Our implementation attains operating speeds of 550 kBytes of raw imagery per second, and achieves a compression ratio of around 2.7:1 on typical AVIRIS data. This demonstrates the utility and applicability of our algorithm towards realizing a deployable hyperspectral image compression system.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124594906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The goal of lossless data compression is to map the set of strings from a given source into a set of binary code strings. A variable-to-fixed length encoding procedure is a mapping from a dictionary of variable length strings of source outputs to the set of codewords of a given length. For memoryless sources, the Tunstall procedure can be applied to construct optimal uniquely parsable dictionaries and the resulting codes are known to work especially well for sources with small entropies. We introduce the idea of plurally parsable dictionaries and show how to design plurally parsable dictionaries that can outperform the Tunstall dictionary of the same size on very predictable binary, memoryless sources.
{"title":"Variable-to-fixed length codes and plurally parsable dictionaries","authors":"S. Savari","doi":"10.1109/DCC.1999.755695","DOIUrl":"https://doi.org/10.1109/DCC.1999.755695","url":null,"abstract":"The goal of lossless data compression is to map the set of strings from a given source into a set of binary code strings. A variable-to-fixed length encoding procedure is a mapping from a dictionary of variable length strings of source outputs to the set of codewords of a given length. For memoryless sources, the Tunstall procedure can be applied to construct optimal uniquely parsable dictionaries and the resulting codes are known to work especially well for sources with small entropies. We introduce the idea of plurally parsable dictionaries and show how to design plurally parsable dictionaries that can outperform the Tunstall dictionary of the same size on very predictable binary, memoryless sources.","PeriodicalId":103598,"journal":{"name":"Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130905031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}