Summary form only given. Error resiliency is the ability to tolerate uncorrectable errors with graceful quality degradation. It differs from traditional uses of error-correcting coding (ECC) in two major respects: (1) it assigns differentiated (rather than uniform) error protection to different segments of the data, and (2) if errors cannot be corrected in some data segments, a good (albeit degraded) quality reconstruction of the data is still possible. The error resiliency approach is well suited to lossy compression, particularly DCT-based and wavelet-based compression. Under those compression schemes, the data is separated into different frequencies or frequency bands. Since the human visual and auditory systems are more sensitive to lower-frequency data than to higher-frequency data, it is better to protect the lower-frequency data more strongly than the higher-frequency data, given a constant ECC bit rate. With this differentiated error protection, the probability of recovering from errors in the lower-frequency data is higher, and thus the probability of reconstructing good-quality data (e.g., image, video, or sound) is higher. Effective and efficient error resiliency raises many issues that need careful study, some of which are addressed in this paper. We investigate our error resiliency approaches as applied to wavelet compression of images.
{"title":"Error resiliency issues in wavelet compression","authors":"A. Youssef","doi":"10.1109/DCC.1997.582150","DOIUrl":"https://doi.org/10.1109/DCC.1997.582150","url":null,"abstract":"Summary form only given. Error resiliency is the ability to tolerate uncorrectable errors with graceful quality degradation. It differs from traditional uses of error-correcting coding (ECC) in two major respects: (1) it assigns differentiated (rather than uniform) error protection to different segments of the data, and (2) if errors cannot be corrected in some data segments, a good- (albeit degraded-) quality reconstruction of the data is still possible. The error resiliency approach is suitable in lossy compression, particularly, DCT-based and wavelet-based compression. Under those compression schemes, the data is separated into different frequencies or frequency bands. Since the human visual and auditory systems are more sensitive to lower-frequency data than to higher-frequency data, it is better to protect the lower-frequency data more than the higher-frequency data, given a constant ECC bit rate. With this differentiated error protection, the probability of recovering from errors in the lower-frequency data is higher, and thus the probability of reconstructing good-quality data (e.g., image, video or sound) is higher. For effective and efficient error resiliency, many issues need careful study and some are addressed in this paper. We investigate our error resiliency approaches applied to wavelet compression of images.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131468102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. Image compression is achieved by eliminating various types of redundancy that exist in the pixel values. Individual gray-scale images contain interpixel, psychovisual, and coding redundancy. However, sets of similar images contain an additional type of redundancy: set redundancy. Set redundancy is the inter-image redundancy that results from the common information found in more than one image in the set, and it can be exploited to improve compression. Medical imaging is one of the best application areas for the enhanced compression model (ECM) and the set redundancy compression (SRC) methods. Medical images classified by modality and type of exam are very similar to one another because of the standard procedures used in radiology. Therefore, medical image databases contain large amounts of set redundancy, which the ECM can efficiently reduce. Tests performed on a test database of CT brain scans showed significant compression improvement when the images were pre-processed with SRC methods to reduce set redundancy. The images were obtained from a random population of patients, and the tests were performed with the standard compression techniques used in radiology: Huffman encoding, arithmetic coding, and Lempel-Ziv compression. The best improvement resulted from combining the min-max predictive method with Huffman compression. In our tests, we used genetic algorithms to identify the sets of similar images in the image database.
{"title":"Image compression in medical image databases using set redundancy","authors":"K. Karadimitriou, M. Fenstermacher","doi":"10.1109/DCC.1997.582104","DOIUrl":"https://doi.org/10.1109/DCC.1997.582104","url":null,"abstract":"Summary form only given. Image compression is achieved by eliminating various types of redundancy that exist in the pixel values. Individual gray-scale images contain interpixel, psychovisual, and coding redundancy. However, sets of similar images contain an additional type of redundancy: the set redundancy. Set redundancy is the inter-image redundancy that results from the common information found in more than one image in the set. Set redundancy can be used to improve compression. Medical imaging is one of the best application areas for the enhanced compression model (ECM) and the set redundancy compression (SRC) methods. Medical images classified by modality and type of exam are very similar to one another, because of the standard procedures used in radiology. Therefore, medical image databases contain large amounts of set redundancy, which the ECM can efficiently reduce. Tests performed on a test database of CT brain scans showed significant compression improvement when the images were pre-processed with SRC methods to reduce set redundancy. The images were obtained from a random population of patients, and the tests were performed with the standard compression techniques used in radiology: Huffman encoding, arithmetic coding, and Lempel-Ziv compression. The best improvement resulted from combining the min-max predictive method with Huffman compression. In our tests we used genetic algorithms to identify the sets of similar images in the image database.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126436945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data compression and learning are, in some sense, two sides of the same coin. If we paraphrase Occam's razor by saying that a small theory is better than a larger theory with the same explanatory power, we can characterize data compression as a preoccupation with small, and learning as a preoccupation with better. Nevill-Manning et al. (see Proc. Data Compression Conference, Los Alamitos, CA, p.244-253, 1994) presented an algorithm, since dubbed SEQUITUR, that presents both faces of the compression/learning coin. Its performance as a data compression scheme outstrips other dictionary schemes, and the structures that it learns from sequences as diverse as DNA and music are intuitively compelling. We present three new results that characterize SEQUITUR's computational and compression performance. First, we prove that SEQUITUR operates in time linear in n, the length of the input sequence, despite its ability to build a hierarchy as deep as log(n). Second, we show that a sequence can be compressed incrementally, improving on the non-incremental algorithm that was described by Nevill-Manning et al., and making on-line compression feasible. Third, we present an intriguing result that emerged during benchmarking; whereas PPMC outperforms SEQUITUR on most files in the Calgary corpus, SEQUITUR regains the lead when tested on multimegabyte sequences. We make some tentative conclusions about the underlying reasons for this phenomenon, and about the nature of current compression benchmarking.
{"title":"Linear-time, incremental hierarchy inference for compression","authors":"C. Nevill-Manning, I. Witten","doi":"10.1109/DCC.1997.581951","DOIUrl":"https://doi.org/10.1109/DCC.1997.581951","url":null,"abstract":"Data compression and learning are, in some sense, two sides of the same coin. If we paraphrase Occam's razor by saying that a small theory is better than a larger theory with the same explanatory power, we can characterize data compression as a preoccupation with small, and learning as a preoccupation with better. Nevill-Manning et al. (see Proc. Data Compression Conference, Los Alamitos, CA, p.244-253, 1994) presented an algorithm, since dubbed SEQUITUR, that presents both faces of the compression/learning coin. Its performance as a data compression scheme outstrips other dictionary schemes, and the structures that it learns from sequences as diverse as DNA and music are intuitively compelling. We present three new results that characterize SEQUITUR's computational and compression performance. First, we prove that SEQUITUR operates in time linear in n, the length of the input sequence, despite its ability to build a hierarchy as deep as log(n). Second, we show that a sequence can be compressed incrementally, improving on the non-incremental algorithm that was described by Nevill-Manning et al., and making on-line compression feasible. Third, we present an intriguing result that emerged during benchmarking; whereas PPMC outperforms SEQUITUR on most files in the Calgary corpus, SEQUITUR regains the lead when tested on multimegabyte sequences. We make some tentative conclusions about the underlying reasons for this phenomenon, and about the nature of current compression benchmarking.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128888589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. A simple solution to multi-resolution video coding is the simulcast technique, in which each resolution is coded independently; a more efficient method is scalable coding, in which the lower-resolution reproduction of the video is used in coding the higher-resolution video. Temporal scalability is a tool intended for a range of applications, such as broadcasting of interlaced TV and progressive HDTV, for which migration to higher temporal resolution is necessary. Based on simulation results with an MPEG-2 video encoder, scalable coding with nonlinear interpolation achieves a 2-3 dB PSNR improvement over simulcast coding at the same total bit rate. However, for progressive-input, interlace-to-interlace temporal scalability, the scalable coding performance is lower than that of single-layer progressive coding. This is expected, as coding performance decreases in interlaced coding due to the difficulty of motion estimation in interlaced sequences.
{"title":"Temporally scalable video coding using nonlinear deinterlacing","authors":"S. Bayrakeri, R. Mersereau","doi":"10.1109/DCC.1997.582078","DOIUrl":"https://doi.org/10.1109/DCC.1997.582078","url":null,"abstract":"Summary form only given. Although a simple solution to multi-resolution video coding is the simulcast technique, in which each resolution of multi-resolution video is coded independently, a more efficient method is scalable coding. In scalable video coding, the lower resolution reproduction of the video is used in coding the higher resolution video. Temporal scalability is a tool intended for use in a range of applications, such as broadcasting of interlaced TV and progressive HDTV, for which migration to higher temporal resolution is necessary. Based on the simulation results with an MPEG-2 video encoder, it is observed that scalable coding with nonlinear interpolation achieves a 2-3 dB PSNR improvement over the simulcast coding at the same total bit-rate. However, for progressive input: interlace-to-interlace type temporal scalability, the scalable coding performance is lower than that of single layer progressive coding. This is expected as the coding performance decreases in interlaced coding due to the difficulty of motion estimation in interlaced sequences.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132284610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. Our new approach to storing address traces, the RPS (recovered program structure) format, is based on two ideas: first, the structure of the underlying program is reconstructed from the address trace, and second, the output is decomposed into multiple files so that gzip can take advantage of repeated input patterns. In the first step, the control flow of the program is determined by identifying the basic blocks, i.e., the straight-line segments of code with no jumps in or out. Then the invocation sequence of basic blocks is written to a file, which can be compressed by a factor of more than 35, since gzip can easily detect patterns in it. The basic block data contains information on the length of a basic block and on the position of load and store instructions among all instructions; their addresses are stored in separate files. In the second step, the load and store references are partitioned into global, local, and unassigned variable classes. Global variables have the same value for all invocations of a basic block; local variables can be represented as base + offset, where offset is a constant and base only changes between invocations of a basic block. All other variables are "unassigned", and their addresses are stored in separate files as differences to the previous value.
{"title":"Compressing address trace data for cache simulations","authors":"A. Fox, T. Grun","doi":"10.1109/DCC.1997.582096","DOIUrl":"https://doi.org/10.1109/DCC.1997.582096","url":null,"abstract":"Summary form only given. Our new approach of storing address traces, the RPS format (recovered program structure), is based on two ideas: first, the structure of the underlying program is reconstructed from the address trace, and second, the output is decomposed in multiple files such that gzip can take advantage of repeated input patterns. In the first step, the control flow of the program is determined by identifying the basic blocks, i.e., the straight segments of code with no jumps in and out. Then, the invocation sequence of basic blocks is written to a file, which can be compressed by a factor of more than 35, since gzip can easily detect patterns in it. The basic block data contains information on the length of a basic block and on the position of load and store instructions among all instructions. Their addresses are stored in separate files. In the second step, the load and store references are partitioned in global, local and unassigned variable classes. Global variables have the same value for all invocations of a basic block, local variables can be represented as base + offset, where offset is a constant and base only changes between invocations of a basic block. All other variables are \"unassigned\" and their addresses are stored in separate files as a difference to the previous value.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114475725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study the design of entropy-constrained successively refinable scalar quantizers. We propose two algorithms to minimize the average distortion and design such a quantizer. We consider two sets of constraints on the entropy: (i) constraint on the average rate and (ii) constraint on aggregate rates. Both algorithms can be easily extended to design vector quantizers.
{"title":"Entropy-constrained successively refinable scalar quantization","authors":"H. Jafarkhani, H. Brunk, N. Farvardin","doi":"10.1109/DCC.1997.582057","DOIUrl":"https://doi.org/10.1109/DCC.1997.582057","url":null,"abstract":"We study the design of entropy-constrained successively refinable scalar quantizers. We propose two algorithms to minimize the average distortion and design such a quantizer. We consider two sets of constraints on the entropy: (i) constraint on the average rate and (ii) constraint on aggregate rates. Both algorithms can be easily extended to design vector quantizers.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128743167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. We discuss the use of a new algorithm to preprocess text in order to improve the compression ratio of textual documents, in particular online documents such as web pages on the World Wide Web. The algorithm was first introduced in an earlier paper, and in this paper we discuss the applicability of our algorithm in Internet and Intranet environments, and present additional performance measurements regarding compression ratios, memory requirements and run time. Our results show that our preprocessing algorithm usually leads to a significantly improved compression ratio. Our algorithm requires a static dictionary shared by the compressor and the decompressor. The basic idea of the algorithm is to define a unique encryption or signature for each word in the dictionary, and to replace each word in the input text by its signature. Each signature consists mostly of the special character '*' plus as many alphabetic characters as necessary to make the signature unique among all words of the same length in the dictionary. In the resulting cryptic text the most frequently used character is typically the '*' character, and standard compression algorithms like LZW applied to the cryptic text can exploit this redundancy in order to achieve better compression ratios. We compared the performance of our algorithm to other text compression algorithms, including standard compression algorithms such as gzip, Unix 'compress' and PPM, and to one text compression algorithm which uses a static dictionary.
{"title":"Data compression using text encryption","authors":"H. Kruse, A. Mukherjee","doi":"10.1109/DCC.1997.582107","DOIUrl":"https://doi.org/10.1109/DCC.1997.582107","url":null,"abstract":"Summary form only given. We discuss the use of a new algorithm to preprocess text in order to improve the compression ratio of textual documents, in particular online documents such as web pages on the World Wide Web. The algorithm was first introduced in an earlier paper, and in this paper we discuss the applicability of our algorithm in Internet and Intranet environments, and present additional performance measurements regarding compression ratios, memory requirements and run time. Our results show that our preprocessing algorithm usually leads to a significantly improved compression ratio. Our algorithm requires a static dictionary shared by the compressor and the decompressor. The basic idea of the algorithm is to define a unique encryption or signature for each word in the dictionary, and to replace each word in the input text by its signature. Each signature consists mostly of the special character '*' plus as many alphabetic characters as necessary to make the signature unique among all words of the same length in the dictionary. In the resulting cryptic text the most frequently used character is typically the '*' character, and standard compression algorithms like LZW applied to the cryptic text can exploit this redundancy in order to achieve better compression ratios. We compared the performance of our algorithm to other text compression algorithms, including standard compression algorithms such as gzip, Unix 'compress' and PPM, and to one text compression algorithm which uses a static dictionary.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133225446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. The ELS-coder is a new entropy-coding algorithm combining rapid encoding and decoding with near-optimum compression ratios. It can be combined with data-modeling methods to produce data-compression applications for text, images, or any type of digital data. Previous algorithms for entropy coding include Huffman coding, arithmetic coding, and the Q- and QM-coders, but all show limitations of speed or compression performance, so that new algorithms continue to be of interest. The ELS-coder, which uses no multiplication or division operations, operates more rapidly than traditional arithmetic coding. It compresses more effectively than Huffman coding (especially for a binary alphabet) and more effectively than the Q- or QM-coder except for symbol probabilities very close to zero or one.
{"title":"The ELS-coder: a rapid entropy coder","authors":"D. Withers","doi":"10.1109/DCC.1997.582144","DOIUrl":"https://doi.org/10.1109/DCC.1997.582144","url":null,"abstract":"Summary form only given. The ELS-coder is a new entropy-coding algorithm combining rapid encoding and decoding with near-optimum compression ratios. It can be combined with data-modeling methods to produce data-compression applications for text, images, or any type of digital data. Previous algorithms for entropy coding include Huffman coding, arithmetic coding, and the Q- and QM-coders, but all show limitations of speed or compression performance, so that new algorithms continue to be of interest. The ELS-coder, which uses no multiplication or division operations, operates more rapidly than traditional arithmetic coding. It compresses more effectively than Huffman coding (especially for a binary alphabet) and more effectively than the Q- or QM-coder except for symbol probabilities very close to zero or one.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125643698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. The authors discuss context-tree weighting (Willems et al. 1995), originally introduced as a sequential universal source coding method for the class of binary tree sources. The paper discusses the application of the method to the compaction of ASCII sequences. Estimation redundancy and model redundancy are also considered.
{"title":"A context-tree weighting method for text generating sources","authors":"T. Tjalkens, P. Volf, F. Willems","doi":"10.1109/DCC.1997.582140","DOIUrl":"https://doi.org/10.1109/DCC.1997.582140","url":null,"abstract":"Summary form only given. The authors discuss context tree weighting (Willems et al. 1995). This was originally introduced as a sequential universal source coding method for the class of binary tree sources. The paper discusses the application of the method to the compaction of ASCII sequences. The estimation of redundancy and model redundancy are also considered.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128035466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary form only given. The best-performing method in the data compression literature for computing probability estimates of sequences on-line using a suffix-tree model is the blending technique used by PPM. Blending can be viewed as a bottom-up recursive procedure for computing a mixture, barring one missing term for each level of the recursion, where a mixture is basically a weighted average of several probability estimates. The author shows the relative effectiveness of most combinations of mixture weighting functions and inheritance evaluation times. The results of a study on the value of using update exclusion, especially in models using state selection, are also shown.
{"title":"Generalization and improvement to PPM's \"blending\"","authors":"S. Bunton","doi":"10.1109/DCC.1997.582082","DOIUrl":"https://doi.org/10.1109/DCC.1997.582082","url":null,"abstract":"Summary form only given. The best-performing method in the data compression literature for computing probability estimates of sequences on-line using a suffix-tree model is the blending technique used by PPM. Blending can be viewed as a bottom-up recursive procedure for computing a mixture, barring one missing term for each level of the recursion, where a mixture is basically a weighted average of several probability estimates. The author shows the relative effectiveness of most combinations of mixture weighting functions and inheritance evaluation times. The results of a study on the value of using update exclusion, especially in models using state selection, are also sown.","PeriodicalId":403990,"journal":{"name":"Proceedings DCC '97. Data Compression Conference","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127843006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}