"Move-to-front and permutation based inversion coding," Z. Arnavut. doi:10.1109/DCC.1999.785672
[Summary form only given]. Introduced by Bentley et al. (1986), move-to-front (MTF) coding is an adaptive, self-organizing list (permutation) technique. Motivated by the MTF coder's use of small permutations restricted to the alphabet size of the data source, we investigate compression of data files using canonical sorting permutations followed by permutation-based inversion coding (PBIC) over the set {0, ..., n-1}, where n is the size of the data source. The technique introduced yields better compression gain than the MTF coder and improves the compression gain of block-sorting techniques.
"Low complexity high-order context modeling of embedded wavelet bit streams," Xiaolin Wu. doi:10.1109/DCC.1999.755660
In the past three or so years, particularly during the JPEG 2000 standardization process launched last year, statistical context modeling of embedded wavelet bit streams has received a lot of attention from the image compression community. High-order context modeling has proven indispensable for the high rate-distortion performance of wavelet image coders. However, if care is not taken in algorithm design and implementation, the formation of high-order modeling contexts can be both CPU- and memory-intensive, creating a computation bottleneck for wavelet coding systems. In this paper we focus on the operational aspect of high-order statistical context modeling and introduce fast algorithmic techniques that can drastically reduce both the time and space complexities of high-order context modeling in the wavelet domain.
"Three-dimensional wavelet coding of video with global motion compensation," Albert Wang, Zixiang Xiong, P. Chou, S. Mehrotra. doi:10.1109/DCC.1999.755690
Three-dimensional (2D+T) wavelet coding of video using SPIHT has been shown to outperform standard predictive video coders on complex high-motion sequences, and is competitive with standard predictive video coders on simple low-motion sequences. However, on a number of typical moderate-motion sequences characterized by largely rigid motions, 3D SPIHT performs several dB worse than motion-compensated predictive coders because it does not take advantage of the real physical motion underlying the scene. We introduce global motion compensation for 3D subband video coders and find a 0.5 to 2 dB gain on sequences with dominant background motion. Our approach is a hybrid of sprite-based (mosaic-based) video coding and subband coding.
"Edge-adaptive prediction for lossless image coding," Wee Sun Lee. doi:10.1109/DCC.1999.755698
We design an edge-adaptive predictor for lossless image coding. The predictor adaptively weights a four-directional predictor together with an adaptive linear predictor, based on information from neighbouring pixels. Although conceptually simple, the resulting coder is comparable in performance to state-of-the-art image coders when a simple context-based coder is used to encode the prediction errors.
"Universal lossless source coding with the Burrows Wheeler transform," M. Effros, Karthik Venkat Ramanan, S. R. Kulkarni, S. Verdú. doi:10.1109/DCC.1999.755667
We consider a theoretical evaluation of data compression algorithms based on the Burrows-Wheeler transform (BWT). The main contributions include a variety of very simple new techniques for BWT-based universal lossless source coding on finite-memory sources and a set of new rate-of-convergence results for BWT-based source codes. The result is a theoretical validation and quantification of the earlier experimental observation that BWT-based lossless source codes perform better than Ziv-Lempel-style codes and almost as well as prediction by partial matching (PPM) algorithms.
"Text mining: a new frontier for lossless compression," I. Witten, Zane Bray, M. Mahoui, W. Teahan. doi:10.1109/DCC.1999.755669
Data mining, a burgeoning new technology, is about looking for patterns in data. Likewise, text mining is about looking for patterns in text. Text mining is possible because you do not have to understand text in order to extract useful information from it. Here are four examples. First, if only names could be identified, links could be inserted automatically to other places that mention the same name, links that are "dynamically evaluated" by calling upon a search engine to bind them at click time. Second, actions can be associated with different types of data, using either explicit programming or programming-by-demonstration techniques. A day/time specification appearing anywhere within one's e-mail could be associated with diary actions such as updating a personal organizer or creating an automatic reminder, and each mention of a day/time in the text could raise a popup menu of calendar-based actions. Third, text could be mined for data in tabular format, allowing databases to be created from formatted tables such as stock-market information on Web pages. Fourth, an agent could monitor incoming newswire stories for company names and collect documents that mention them, acting as an automated press-clipping service. This paper aims to promote text compression as a key technology for text mining.
"Context quantization with Fisher discriminant for adaptive embedded wavelet image coding," Xiaolin Wu. doi:10.1109/DCC.1999.755659
Recent progress in context modeling and adaptive entropy coding of wavelet coefficients has probably been the most important catalyst for the rapidly maturing area of wavelet image compression technology. In this paper we identify statistical context modeling of wavelet coefficients as the determining factor in the rate-distortion performance of wavelet codecs. We propose a new context quantization algorithm for minimum conditional entropy. The algorithm is a dynamic programming process guided by Fisher's linear discriminant. It facilitates high-order context modeling and adaptive entropy coding of embedded wavelet bit streams, and leads to superb compression performance in both lossy and lossless cases.
"Modified SPIHT encoding for SAR image data," Z. Zeng, I. Cumming. doi:10.1109/DCC.1999.785719
Summary form only given. We developed a wavelet-based SAR image compression algorithm which combines tree-structured texture analysis, soft-thresholding speckle reduction, quadtree homogeneous decomposition, and a modified zero-tree coding scheme. First, the tree-structured wavelet transform is applied to the SAR image. The decomposition is no longer applied only to the low-scale subsignals recursively but to the output of any filter. The decomposition criterion is the energy of the image: if the energy of a subimage is significantly smaller than the others, we stop the decomposition in that region since it contains less information. The texture factors, which represent the amount of texture information, are created after this step. Second, quadtree decomposition is used to split the lowest-scale component into two sets, a homogeneous set and a target set. The homogeneous set consists of the relatively homogeneous regions; the target set consists of the non-homogeneous regions, which have been further decomposed into single-component regions. A conventional soft threshold is applied to reduce speckle noise on all the wavelet coefficients except those of the lowest scale, with the feature factor used to set the threshold. Finally, the conventional SPIHT method is modified based on the results of the tree-structured decomposition and the quadtree decomposition. In the encoder, the amount of speckle reduction is chosen based on the requirements of the user. Different coding schemes are applied to the homogeneous set and the target set. The skewed distribution of the residuals makes arithmetic coding the best choice for lossless compression.
"Lossless JBIG2 coding performance," D. Tompkins, F. Kossentini. doi:10.1109/DCC.1999.785710
Summary form only given. The Joint Bi-Level Expert Group (JBIG), an international study group affiliated with the ISO/IEC and ITU-T, has recently completed a committee draft of the JBIG2 standard for lossy and lossless bi-level image compression. We study design considerations for a purely lossless encoder. First, we outline the JBIG2 bitstream, focusing on the options and parameters available to an encoder. Then, we present numerous lossless encoder design strategies, including lossy-to-lossless coding approaches. For each strategy, we determine the compression performance and the execution times for both encoding and decoding. The strategy that achieved the highest compression performance in our experiment used a double-dictionary approach with a residue cleanup. In this strategy, small and unique symbols were coded as a generic region residue. Only repeated symbols or those used as a basis for soft matches were added to a dictionary, with the remaining symbols embedded as refinements in the symbol region segment. The second dictionary was encoded as a refinement-aggregate dictionary, where dictionary symbols were encoded as refinements of symbols from the first dictionary or of previous entries in the second dictionary. With all other bitstream parameters optimized, this strategy can easily achieve an additional 30% compression over simpler symbol dictionary approaches. Next, we continue the experiment with an evaluation of each of the bitstream options and configuration parameters, and their impact on complexity and compression. We also demonstrate the consequences of choosing incorrect parameters. We conclude with a summary of our compression results and general recommendations for encoder designers.
"Data compression using long common strings," J. Bentley. doi:10.1109/DCC.1999.755678
We describe a precompression algorithm that effectively represents any long common strings that appear in a file. The algorithm interacts well with standard compression algorithms, which represent shorter strings that occur near each other in the input text. Our experiments show that some real data sets do indeed contain many long common strings. We extend the fingerprint mechanisms of our algorithm into a program that identifies long common strings in an input file. This program gives interesting insights into the structure of real data files that contain long common strings.