
Latest publications: Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)

Lexical attraction for text compression
Pub Date : 1999-03-29 DOI: 10.1109/DCC.1999.785673
Joscha Bach, I. Witten
[Summary form only given]. The best methods of text compression work by conditioning each symbol's probability on its predecessors. Prior symbols establish a context that governs the probability distribution for the next one, and the actual next symbol is encoded with respect to this distribution. However, the best predictors for words in natural language are not necessarily their immediate predecessors. Verbs may depend on nouns, pronouns on names, closing brackets on opening ones, question marks on "wh"-words. To establish a more appropriate dependency structure, the lexical attraction of a pair of words is defined as the likelihood that they will appear (in that order) within a sentence, regardless of how far apart they are. This is estimated by counting the co-occurrences of words in the sentences of a large corpus. Then, for each sentence, an undirected (planar, acyclic) graph is found that maximizes the lexical attraction between linked items, effectively reorganizing the text in the form of a low-entropy model. We encode a series of linked sentences and transmit them in the same manner as order-1 word-level PPM. To prime the lexical attraction linker, the whole document is processed once to acquire the co-occurrence counts, and again to re-link the sentences. Pairs that occur twice or less are excluded from the statistics, which significantly reduces the size of the model. The encoding stage uses an adaptive PPM-style method. Encouraging results have been obtained with this method.
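As a rough sketch of the statistics-gathering pass described above, the Python fragment below counts ordered within-sentence co-occurrences and prunes pairs seen twice or less; the function names and naive tokenization are our own assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(sentences, min_count=3):
    """Count ordered word pairs (w1 appearing before w2) within each
    sentence, regardless of distance; drop pairs occurring twice or less."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()                # naive tokenization (assumption)
        for w1, w2 in combinations(words, 2):   # all ordered pairs by position
            counts[(w1, w2)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}

corpus = ["who is there ?", "who knows who is coming ?"] * 2
print(cooccurrence_counts(corpus)[("who", "?")])  # 6: the "wh"-word attracts "?"
```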
Citations: 6
Compression of arbitrary cutting planes
Pub Date : 1999-03-29 DOI: 10.1109/DCC.1999.785685
Yanlin Guan, R. Moorhead
Summary form only given. We present an efficient algorithm for compressing the data necessary to represent an arbitrary cutting plane extracted from a three-dimensional curvilinear data set. The cutting plane technique is an important visualization method for time-varying 3D simulation results since the data sets are often so large. An efficient compression algorithm for these cutting planes is especially important when the simulation running on a remote server is being tracked or the data set is stored on a remote server. Various aspects of the visualization process are considered in the algorithm design, such as the inherent data reduction in going from 3D to 2D when generating a cutting plane, the numerical accuracy required in the cutting plane, and the potential to decimate the triangle mesh. After separating each floating point number into mantissa and exponent, a block sorting algorithm and an entropy coding algorithm are used to perform lossless compression.
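As an illustration of the mantissa/exponent separation step, here is a minimal Python sketch; the naming and the 53-bit integer packing are our own choices, since the paper's exact bit layout is not given in the summary.

```python
import math

def split_floats(values):
    """Split each IEEE double v into an integer mantissa and an exponent,
    so the two streams can be block-sorted and entropy-coded separately.
    math.frexp gives v == m * 2**e with 0.5 <= |m| < 1; scaling m by 2**53
    turns it into an exact integer, keeping the split lossless."""
    mantissas, exponents = [], []
    for v in values:
        m, e = math.frexp(v)
        mantissas.append(int(m * (1 << 53)))
        exponents.append(e)
    return mantissas, exponents

m, e = split_floats([3.25, -0.1, 1000.0])
# exponents of values on a nearby grid cluster tightly, which is what
# makes the separated streams compress well
print(e)  # [2, -3, 10]
```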
Citations: 0
Fast progressive wavelet coding
Pub Date : 1999-03-29 DOI: 10.1109/DCC.1999.755683
Henrique S. Malvar
Fast and efficient image compression can be achieved with the progressive wavelet coder (PWC) introduced here. Unlike many previous wavelet coders, PWC does not rely on zerotrees or other ordering schemes based on parent-child wavelet relationships. PWC has a very simple structure, based on two key concepts: (1) data-independent reordering and blocking, and (2) low-complexity independent encoding of each block via adaptive Rice coding of bit planes. In that way, PWC allows for progressive image encoding that is scalable both in resolution and bit rate, with a fully embedded bitstream. PWC achieves a rate/distortion performance comparable to that of the state-of-the-art SPIHT (set partitioning in hierarchical trees) coder, but with a better performance/complexity ratio.
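A minimal sketch of Rice-coding one bit plane of a block, in the spirit of concept (2); PWC's actual adaptation rule for the Rice parameter and its significance-pass details are not in the abstract, so the structure below is illustrative only.

```python
def rice_encode(n, k):
    """Rice code of a non-negative integer: quotient n >> k in unary
    (that many 1s plus a terminating 0), then the k low remainder bits."""
    bits = "1" * (n >> k) + "0"
    if k:
        bits += format(n & ((1 << k) - 1), "0{}b".format(k))
    return bits

def encode_bitplane(plane, k):
    """Toy bit-plane pass: Rice-code the zero-run length before each 1."""
    out, run = [], 0
    for bit in plane:
        if bit:
            out.append(rice_encode(run, k))
            run = 0
        else:
            run += 1
    return "".join(out)

block = [3, 0, 1, 7, 2, 0, 0, 5]          # toy coefficient magnitudes
plane = [(c >> 2) & 1 for c in block]      # bit plane 2: [0,0,0,1,0,0,0,1]
print(encode_bitplane(plane, k=1))         # '101101': two zero-runs of 3
```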
Citations: 40
Parallel memories in video encoding
Pub Date : 1999-03-29 DOI: 10.1109/DCC.1999.785709
Jarno K. Tanskanen, J. Niittylahti
Summary form only given. A novel architecture with parallel memories suitable for hybrid video coding is presented. It efficiently relieves the memory bandwidth bottleneck in the motion estimation, DCT, and IDCT involved in the real-time low-bit-rate ITU-T H.263 video compression standard. There are four parallel processing elements and eight parallel memory blocks in the system. The address space is divided into three areas. Coordinate areas 0 and 1 can be accessed simultaneously in the row or column formats needed in motion estimation, DCT, and IDCT. Alternatively, area 2 can be accessed for more complex formats. Such formats are needed, for example, in zigzag scanning and interpolation. The module assignment function S(i,j) expresses how data is stored in the memory modules. We can describe the memory space as a 2D coordinate system with horizontal and vertical coordinates (i,j). The coordinate values are restricted to positive values, and (0,0) is fixed at the uppermost left corner of the coordinate area. The function S(i,j) simply identifies the memory block where the value of coordinate point (i,j) is stored. Memory addresses are described by the address function a(i,j). Coordinate area 0 maps to memory blocks 0...3, area 1 to blocks 4...7, and area 2 to blocks 0...7. The constants a_0max and a_1max are the maximum addresses of coordinate areas 0 and 1, respectively. The width of the coordinate area is given by L_i. The processing power increases linearly with the number of parallel processing elements. Using more parallel memory blocks enables the use of more access formats.
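The summary does not spell out the concrete S(i,j) and a(i,j); as an illustration of how such functions work, the sketch below uses a classic skewed assignment that makes length-4 row and column accesses conflict-free. The constants (bank count, frame width) are assumptions.

```python
NUM_BANKS = 4   # banks 0..3, i.e. coordinate area 0 in the paper

def S(i, j):
    """Illustrative module assignment: any 4 consecutive cells along a
    row or a column fall in 4 distinct banks (the paper's exact S(i,j)
    is not given in the summary)."""
    return (i + j) % NUM_BANKS

def a(i, j, width=176):
    """Illustrative address function: linear address within a bank,
    with width chosen as the QCIF luma width."""
    return (j * width + i) // NUM_BANKS

# check conflict-freedom for one row access and one column access
row = {S(5, j) for j in range(8, 12)}
col = {S(i, 8) for i in range(5, 9)}
assert row == col == {0, 1, 2, 3}   # all four banks hit exactly once
```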
Citations: 5
Design consideration for multi-lingual cascading text compressors
Pub Date : 1999-03-29 DOI: 10.1109/DCC.1999.785677
Chi-Hung Chi, Yan Zhang
Summary form only given. We study the cascading of LZ variants with Huffman coding for multilingual documents. Two models are proposed: the static model and the adaptive (dynamic) model. The static model makes use of the dictionary generated by the LZW algorithm in Chinese dictionary-based Huffman compression to achieve better performance. The dynamic model is an extension of the static cascading model. During the insertion of phrases into the dictionary, the frequency count of the phrases is updated, so that a dynamic Huffman tree with variable-length output tokens is obtained. We propose a new method to capture the "LZW dictionary" by picking up the dictionary entries during decompression. The general idea is to add delimiters during the decompression process so that the decompressed files are segmented into phrases that reflect how the LZW compressor makes use of its dictionary phrases to encode the source. The idea of the adaptive cascading model can be thought of as an extension of Chinese LZW compression. Since the size of the header is one important performance bottleneck in the static cascading model, we propose the adaptive cascading model to address this issue. The LZW compressor now outputs not a fixed-length token, but a variable-length Huffman code from the Huffman tree. It is expected that such a compressor can achieve very good compression performance. In our adaptive cascading model we choose LZW instead of LZSS because the LZW algorithm preserves more information than the LZSS algorithm does. This characteristic is found to be very useful in helping Chinese compressors attain better performance.
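A toy version of the cascade in the static model: parse with plain LZW, then derive Huffman code lengths for the emitted tokens. Everything here (names, byte-level dictionary, no Chinese-specific handling) is our simplification for illustration.

```python
import heapq
from collections import Counter
from itertools import count

def lzw_tokens(data: bytes):
    """Plain LZW parse; returns the sequence of dictionary indices."""
    table = {bytes([b]): b for b in range(256)}
    w, out = b"", []
    for byte in data:
        wb = w + bytes([byte])
        if wb in table:
            w = wb
        else:
            out.append(table[w])
            table[wb] = len(table)     # grow the dictionary
            w = bytes([byte])
    if w:
        out.append(table[w])
    return out

def huffman_lengths(tokens):
    """Huffman code length per token: the cascaded coder would emit these
    variable-length codes instead of fixed-length LZW indices."""
    uid = count()
    heap = [(c, next(uid), (t,)) for t, c in Counter(tokens).items()]
    heapq.heapify(heap)
    if len(heap) == 1:                 # degenerate single-symbol case
        return {heap[0][2][0]: 1}
    depth = Counter()
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for t in s1 + s2:              # every merge deepens all leaves below
            depth[t] += 1
        heapq.heappush(heap, (c1 + c2, next(uid), s1 + s2))
    return dict(depth)

toks = lzw_tokens(b"abababababab")
print(toks, huffman_lengths(toks))
```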
Citations: 1
Almost-optimal fully LZW-compressed pattern matching
Pub Date : 1999-03-29 DOI: 10.1109/DCC.1999.755681
L. Gąsieniec, W. Rytter
Given two strings, a pattern P and a text T of lengths |P|=M and |T|=N, the string matching problem is to find all occurrences of pattern P in text T. The fully compressed string matching problem is the string matching problem with input strings P and T given in compressed forms p and t respectively, where |p|=m and |t|=n. We present the first almost-optimal string matching algorithms for LZW-compressed strings, running in (1) O((n+m)log(n+m)) time on a single-processor machine, and (2) Õ(n+m) work on an (n+m)-processor PRAM. The techniques used can be applied in the design of efficient algorithms for a wide range of the most typical string problems in the compressed LZW setting, including computing a period of a word, finding repetitions, symmetries, counting subwords, and multi-pattern matching.
Citations: 46
Bounding the compression loss of the FGK algorithm
Pub Date : 1999-03-29 DOI: 10.1109/DCC.1999.785696
R. Milidiú, E. Laber, A. Pessoa
[Summary form only given]. For data communication purposes, the initial parsing required by the static Huffman algorithm represents a big disadvantage, because the data must be transmitted on-line: as soon as a symbol arrives at the transmitter, it must be encoded and transmitted to the receiver. In these situations, adaptive Huffman codes have been widely used. This method determines the mapping from symbol alphabet to codewords based upon a running estimate of the alphabet symbol weights. The code is adaptive, changing so as to remain optimal for the current estimates. Two methods have been presented in the literature for implementing dynamic Huffman coding: the FGK algorithm (Knuth, 1985) and the Λ algorithm (Vitter, 1987). Vitter proved that the total number of bits D_t transmitted by the FGK algorithm for a message with t symbols is bounded below by S_t - n + 1, where S_t is the number of bits required by the static Huffman method, and bounded above by 2S_t + t - 4n + 2. Furthermore, he conjectured that D_t is bounded above by S_t + O(t). We present an amortized analysis to prove this conjecture by showing that D_t ≤ S_t + 2t - 2k - ⌊log min(k+1, n)⌋, where k is the number of distinct symbols in the message. We also present an example where D_t = S_t + 2t - 2k - 3⌊(t-k)/k⌋ - ⌊log(k+1)⌋, showing that the proposed bound is asymptotically tight. These results explain the good performance of FGK observed by some authors in practical experiments.
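The new upper bound is simple to evaluate numerically; a small sketch with invented figures (S_t, t, k, n as defined in the abstract; we take the log as base 2, the natural reading when counting bits):

```python
import math

def fgk_upper_bound(S_t, t, k, n):
    """Claimed bound on bits sent by FGK:
    D_t <= S_t + 2t - 2k - floor(log2(min(k + 1, n)))."""
    return S_t + 2 * t - 2 * k - math.floor(math.log2(min(k + 1, n)))

# hypothetical message: 1000 symbols, 26 distinct, 4200 static-Huffman bits
print(fgk_upper_bound(S_t=4200, t=1000, k=26, n=26))   # 6144
# Vitter's earlier upper bound 2*S_t + t - 4n + 2 gives 9298 for the same data
print(2 * 4200 + 1000 - 4 * 26 + 2)                    # 9298
```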
Citations: 13
Constrained wavelet packets for tree-structured video coding algorithms
Pub Date : 1999-03-29 DOI: 10.1109/DCC.1999.755685
H. Khalil, A. Jacquin, C. Podilchuk
Traditional wavelet packet (WP) optimization techniques neglect information about the structure of the lossy part of the compression scheme. Such information, however, can help guide the optimization procedure so as to result in efficient WP structures. We propose a wavelet packet algorithm with a constrained rate-distortion optimization which makes it suited to subsequent tree-structured coding, such as with the set partitioning in hierarchical trees (SPIHT) algorithm. The (octave-band) wavelet transform lends itself to simple and coherent tree-shaped spatial relations, which can then be used to define zero-trees. Yet input images have different frequency distributions, and an adaptive transform such as WP is bound to be more efficient on an image-by-image basis. With WP algorithms, the coefficients in the WP domain can be rearranged to produce what resembles (or simulates) the normal wavelet transform structure. This stage is usually performed to simplify the coding stage. However, an unconstrained optimization can result in a transformed image with complicated or incoherent tree-shaped spatial relations. This work aims to show that the efficiency of embedded coders such as SPIHT and Shapiro's zerotrees strongly depends on WP structures with coherent spatial tree relationships.
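For intuition, the unconstrained part of WP selection is the classic best-basis recursion sketched below (split a subband only when its children's total rate-distortion cost is lower). The paper's contribution, constraining this choice so the resulting structure keeps SPIHT-friendly parent-child relations, is omitted here, and the cost callback is an assumption.

```python
def best_basis(cost, node=(0, 0), max_depth=3):
    """Unconstrained WP best-basis selection. `cost(level, index)` is an
    assumed callback giving the R-D cost of coding one subband whole.
    Returns (total cost, chosen leaf subbands)."""
    level, idx = node
    own = cost(level, idx)
    if level == max_depth:
        return own, [node]
    child_cost, leaves = 0.0, []
    for c in range(4):                      # 2D split: four child subbands
        cc, cl = best_basis(cost, (level + 1, 4 * idx + c), max_depth)
        child_cost += cc
        leaves += cl
    # split only if the four children together beat coding the node whole
    return (child_cost, leaves) if child_cost < own else (own, [node])

# toy cost model: splitting pays off only near the root
total, leaves = best_basis(lambda lv, ix: 10.0 / (lv + 1) + ix % 3)
print(total, leaves)
```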
Citations: 11
Decision trees for error concealment in video decoding
Pub Date : 1999-03-01 DOI: 10.1109/DCC.1999.755688
Song Cen, P. Cosman, F. Azadegan
When macroblocks are lost in an MPEG decoder, the decoder can try to conceal the error by estimating or interpolating the missing area. Many different methods for this type of concealment have been proposed, operating in the spatial, frequency, or temporal domains, or some hybrid combination of them. We show how the use of a decision tree that can adaptively choose among several different error concealment methods can outperform each single method. We also propose two promising new methods for temporal error concealment.
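A toy stand-in for such a tree: a couple of hand-set rules choosing among concealment methods from simple block features. The paper's actual features, thresholds, and method set are not specified in the abstract, so everything below is invented for illustration.

```python
def choose_concealment(motion_activity: float, edge_strength: float) -> str:
    """Hypothetical two-level decision tree for a lost macroblock."""
    if motion_activity < 0.1:
        # static region: the co-located block from the previous frame fits
        return "temporal: copy co-located block"
    if edge_strength > 0.5:
        # strong structure: interpolate along edges from spatial neighbours
        return "spatial: directional interpolation"
    # moving, smooth region: reuse neighbouring blocks' motion vectors
    return "temporal: motion-compensated copy"

print(choose_concealment(motion_activity=0.02, edge_strength=0.8))
```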
Citations: 31
New methods for multiplication-free arithmetic coding
Pub Date : 1900-01-01 DOI: 10.1109/DCC.1999.785712
R. van der Vleuten
Summary form only given. Arithmetic coding is a well-known technique for lossless coding or data compression. We have developed two new multiplication-free methods. Our first new method is to round the interval-width register A to x bits instead of truncating it. Rounding is equivalent to truncating A to its x most significant bits if the (x+1)th most significant bit of A is a 0, and adding 1 to the truncated representation if the (x+1)th most significant bit is a 1. The rounding applied in our new method increases the complexity (compared to truncation), since in about half of the cases 1 has to be added to the truncated representation. As an alternative, we therefore developed a second new method, which we call "partial rounding". By partial rounding we mean that 1 is only added to the truncated representation of A in the case when the (x+1)th most significant bit is a 1 and the xth most significant bit is a 0. In the implementation this means that the xth bit of the approximation of A equals the logical OR of the xth and (x+1)th most significant bits of the original A. The partial rounding of this second new method results in the same approximation as the "full rounding" of the first method in about 75% of the cases, but its complexity is as low as that of truncation (since the complexity of the OR is negligible). Applying the various multiplication-free methods in the arithmetic coder has demonstrated that our new rounding-based method outperforms the previously published multiplication-free methods. The "partial rounding" method outperforms the previously published truncation-based methods.
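The three approximations translate directly into code; a minimal sketch over a w-bit register A (the register width and the clamping of a carry out of the register are our own choices for the sketch):

```python
W = 16  # assumed register width

def truncate(a, x, w=W):
    """Keep only the x most significant bits of the w-bit register A."""
    return a & ~((1 << (w - x)) - 1)

def round_full(a, x, w=W):
    """Method 1: add 1 below the kept bits when the (x+1)th MSB is set,
    then truncate (a carry out of the register is clamped here)."""
    if (a >> (w - x - 1)) & 1:
        a = min(a + (1 << (w - x)), (1 << w) - 1)
    return truncate(a, x, w)

def round_partial(a, x, w=W):
    """Method 2: xth bit of the result = OR of the xth and (x+1)th MSBs
    of A; never generates a carry, so it is as cheap as truncation."""
    t = truncate(a, x, w)
    return t | (((a >> (w - x - 1)) & 1) << (w - x))

a = 0b1011111001101010
for f in (truncate, round_full, round_partial):
    print(f.__name__, format(f(a, 4), "016b"))
# truncate      1011000000000000
# round_full    1100000000000000   (carry into the kept bits)
# round_partial 1011000000000000   (xth bit was already 1: OR changes nothing)
```

This example lands in the roughly 25% of cases where the two new methods disagree: the xth and (x+1)th bits are both 1, so full rounding carries while partial rounding does not.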
Citations: 1