
Latest publications from the 2009 Data Compression Conference

Affine Modeling for the Complexity of Vector Quantizers
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.55
Estevan P. Seraco, J. Gomes
We use a scalar function Θ to describe the complexity of data compression systems based on vector quantizers (VQs). This function is associated with the analog hardware implementation of a VQ, as done for example in focal-plane image compression systems. The rate and distortion of a VQ are represented by a Lagrangian cost function J. In this work we propose an affine model for the relationship between J and Θ, based on several VQ encoders performing the map R^M → {1, 2, ..., K}. A discrete source is obtained by partitioning images into 4×4 pixel blocks and extracting M = 4 principal components from each block. To design entropy-constrained VQs (ECVQs), we use the Generalized Lloyd Algorithm. To design simple interpolative VQs (IVQs), we consider only the simplest encoder: a linear transformation, followed by a layer of M scalar quantizers in parallel – the K cells of R^M are defined by a set of thresholds {t1, ..., tT}. The T thresholds are obtained from a non-linear unconstrained optimization method based on the Nelder-Mead algorithm.

The fundamental unit of complexity Θ is "transistor": we only count the transistors that are used to implement the signal processing part of a VQ analog circuit: inner products, squares, summations, winner-takes-all, and comparators. The complexity functions for ECVQs and IVQs are as follows: ΘECVQ = 2KM + 9K + 3M + 4 and ΘIVQ = 4Mw1 + 2Mw2 + 3Mb1 + Mb2 + 4M + 3T, where Mw1 and Mw2 are the numbers of multiplications by positive and by negative weights. The numbers of positive and negative bias values are Mb1 and Mb2. Since ΘECVQ and ΘIVQ are scalar functions gathering the complexities of several different operations under the same unit, they are useful for the development of models relating rate-distortion cost to complexity.

Using a training set, we designed several ECVQs and plotted all (J, Θ) points on a plane with axes log10(Θ) and log10(J) (J values from a test set). An affine model log10(Θ) = a1 log10(J) + a2 became apparent; a straightforward application of least squares yields the slope and offset coefficients. This procedure was repeated for IVQs. The error between the model and the data has a variance equal to 0.005 for ECVQs and 0.02 for IVQs. To validate the ECVQ and IVQ complexity models, we repeated the design and test procedure using new training and test sets. Then, we used the previously computed complexity models to predict the Θ of the VQs designed independently: the error between the model and the data has a variance equal to 0.01 for ECVQs and 0.02 for IVQs. This shows we are able to predict the rate-distortion performance of independently designed ECVQs and IVQs. This result serves as a starting point for studies on complexity gradients between J and Θ, and as a guideline for introducing complexity constraints in the traditional entropy-constrained Lagrangian cost.
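The affine fit itself is a one-dimensional least-squares problem in log-log space. The sketch below is only an illustration, not the authors' code: the (J, Θ) pairs are synthetic stand-ins for designed ECVQs, and NumPy's polyfit is used to recover the slope a1, offset a2, and the residual variance used as the model-error measure in the abstract.

```python
import numpy as np

# Hypothetical (J, Theta) pairs standing in for designed ECVQs:
# J is the Lagrangian rate-distortion cost, Theta the transistor count.
J = np.array([0.8, 1.5, 2.9, 5.6, 11.0, 21.0])
Theta = np.array([5200.0, 2700.0, 1400.0, 760.0, 420.0, 230.0])

x = np.log10(J)
y = np.log10(Theta)

# Least-squares fit of the affine model log10(Theta) = a1*log10(J) + a2.
a1, a2 = np.polyfit(x, y, deg=1)

# Residual variance between the model and the data (the error measure
# quoted in the abstract, e.g. 0.005 for ECVQs).
residuals = y - (a1 * x + a2)
print(f"a1 = {a1:.3f}, a2 = {a2:.3f}, residual variance = {residuals.var():.4f}")
```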
Citations: 1
Invertible Integer Lie Group Transforms
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.38
Yusong Yan, Hongmei Zhu
Invertible integer transforms are essential for lossless source encoding. Using lifting schemes, we develop a new family of invertible integer transforms based on discrete generalized cosine transforms. The discrete generalized cosine transforms, which arise in connection with compact semi-simple Lie groups of rank 2, are orthogonal over a fundamental region and have recently attracted more attention in digital image processing. Since these integer transforms are invertible, they have potential applications in lossless image compression and encryption.
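The exact invertibility comes from the lifting structure: each lifting step adds a rounded function of one channel to another, so it can be undone exactly even though rounding is nonlinear. The sketch below shows only this generic idea on the simplest case (an integer Haar/S-transform); it is not the authors' discrete generalized cosine transform, and the input data is arbitrary.

```python
import numpy as np

def forward_lifting(x):
    """Integer-to-integer Haar (S-transform) via lifting: predict then update,
    each with integer rounding, so every step is exactly reversible."""
    a, b = x[0::2].astype(int), x[1::2].astype(int)
    d = b - a                      # predict step
    s = a + (d >> 1)               # update step with integer (floor) rounding
    return s, d

def inverse_lifting(s, d):
    a = s - (d >> 1)               # undo update
    b = d + a                      # undo predict
    x = np.empty(2 * len(s), dtype=int)
    x[0::2], x[1::2] = a, b
    return x

x = np.array([5, 7, 2, 9, 4, 4, 1, 8])
s, d = forward_lifting(x)
assert np.array_equal(inverse_lifting(s, d), x)  # exact reconstruction
```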
Citations: 0
Source Coding Scheme for Multiple Sequence Alignments
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.64
P. Hanus, J. Dingel, Georg Chalkidis, J. Hagenauer
Rapid development of DNA sequencing technologies exponentially increases the amount of publicly available genomic data. Whole genome multiple sequence alignments represent a particularly voluminous, frequently downloaded static dataset. In this work we propose an asymmetric source coding scheme for such alignments using evolutionary prediction in combination with lossless black-and-white image compression. Compared to the Lempel-Ziv algorithm used so far, the compression rates are almost halved.
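One way to read the scheme is: predict each aligned sequence, then losslessly code the binary pattern of prediction failures as a bi-level image. The toy sketch below follows only that reading and none of the paper's actual components: the "evolutionary prediction" is replaced by copying the first sequence, and zlib stands in for the bi-level image coder.

```python
import zlib
import numpy as np

# A toy alignment block: rows are sequences, columns are alignment positions.
alignment = np.array([list("ACGTACGT"),
                      list("ACGTACGA"),
                      list("ACCTACGT")])

# Stand-in for the paper's evolutionary prediction: predict every row
# from the first sequence (the real scheme uses a phylogenetic model).
prediction = np.broadcast_to(alignment[0], alignment.shape)

# Where prediction succeeds, only a binary "match" flag needs coding;
# this mask is what a bi-level (black-and-white) coder would compress.
mismatch_mask = (alignment != prediction).astype(np.uint8)

# zlib is used here only as a generic lossless back end, not the
# bi-level image coder from the paper.
compressed = zlib.compress(np.packbits(mismatch_mask).tobytes(), level=9)
print(len(compressed), "bytes for the mismatch mask")
```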
Citations: 3
Decentralized Estimation Using Learning Vector Quantization
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.77
Mihajlo Grbovic, S. Vucetic
Decentralized estimation is an essential problem for a number of data fusion applications. In this paper we propose a variation of the Learning Vector Quantization (LVQ) algorithm, the Distortion Sensitive LVQ (DSLVQ), to be used for quantizer design in decentralized estimation. Experimental results suggest that DSLVQ results in high-quality quantizers and that it allows easy adjustment of the complexity of the resulting quantizers to the computational constraints of decentralized sensors. In addition, the DSLVQ approach shows significant improvements over the popular LVQ2 algorithm as well as the previously proposed Regression Tree approach for decentralized estimation.
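For reference, the classic LVQ1 update rule that DSLVQ departs from is sketched below; the distortion-sensitive modification is the paper's contribution and is not reproduced here. The data, learning rate, and prototype initialisation are all made up for the example.

```python
import numpy as np

def lvq1_step(prototypes, labels, x, y, lr=0.05):
    """One LVQ1 update: pull the winning prototype toward the sample when the
    class agrees, push it away otherwise (DSLVQ adds a distortion-sensitive
    term to this rule, which is not shown here)."""
    k = np.argmin(np.linalg.norm(prototypes - x, axis=1))  # nearest prototype
    sign = 1.0 if labels[k] == y else -1.0
    prototypes[k] += sign * lr * (x - prototypes[k])
    return prototypes

# Toy 2-D data with two classes and one prototype per class.
rng = np.random.default_rng(0)
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 1])
for _ in range(200):
    y = rng.integers(2)
    x = rng.normal(loc=y, scale=0.3, size=2)
    prototypes = lvq1_step(prototypes, labels, x, y)
print(prototypes)
```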
Citations: 3
An Adaptive Sub-sampling Method for In-memory Compression of Scientific Data
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.65
D. Unat, T. Hromadka, S. Baden
A current challenge in scientific computing is how to curb the growth of simulation datasets without losing valuable information. While wavelet-based methods are popular, they require that data be decompressed before it can be analyzed, for example, when identifying time-dependent structures in turbulent flows. We present Adaptive Coarsening, an adaptive subsampling compression strategy that enables the compressed data product to be directly manipulated in memory without requiring costly decompression. We demonstrate compression factors of up to 8 in turbulent flow simulations in three dimensions. Our compression strategy produces a non-progressive multiresolution representation, subdividing the dataset into fixed-sized regions and compressing each region independently.
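The core mechanism, region-by-region subsampling guarded by an error test, can be sketched in a few lines. The following is a minimal 1-D toy under assumed choices (block size, linear interpolation, max-error tolerance); the paper works on 3-D simulation fields.

```python
import numpy as np

def adaptive_coarsen(field, block=8, tol=1e-2):
    """Toy adaptive sub-sampling in the spirit of Adaptive Coarsening:
    split a 1-D field into fixed-size blocks and keep a coarser set of
    samples in blocks where the reconstruction error stays below `tol`.
    Block size, error norm, and interpolation are assumptions, not the
    paper's exact scheme."""
    out = []
    for i in range(0, len(field), block):
        region = field[i:i + block]
        idx = np.unique(np.r_[np.arange(0, len(region), 2), len(region) - 1])
        coarse = region[idx]
        # Reconstruct by linear interpolation and measure the max error.
        rebuilt = np.interp(np.arange(len(region)), idx, coarse)
        if np.max(np.abs(rebuilt - region)) < tol:
            out.append(("coarse", coarse))      # compressed region
        else:
            out.append(("full", region))        # kept at full resolution
    return out

x = np.linspace(0, 2 * np.pi, 64)
field = np.sin(x) + 0.05 * (x > np.pi) * np.sin(40 * x)  # smooth + "turbulent" part
regions = adaptive_coarsen(field, block=8, tol=1e-2)
print(sum(kind == "coarse" for kind, _ in regions), "of", len(regions), "regions coarsened")
```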
Citations: 17
l1 Compression of Image Sequences Using the Structural Similarity Index Measure
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.28
J. Dahl, Jan Østergaard, T. L. Jensen, S. H. Jensen
We consider lossy compression of image sequences using l1-compression with overcomplete dictionaries. As a fidelity measure for the reconstruction quality, we incorporate the recently proposed structural similarity index measure, and we show that this leads to problem formulations that are very similar to conventional l1 compression algorithms. In addition, we develop efficient large-scale algorithms used for joint encoding of multiple image frames.
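The fidelity measure being folded into the l1 formulation is the standard SSIM index. The sketch below computes a simplified single-window SSIM from global image statistics (the usual definition averages it over local windows); it only illustrates the measure, not the paper's compression algorithm, and the test images are synthetic.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Structural similarity index from global statistics: a simplified,
    single-window version of the standard SSIM definition."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(1)
img = rng.uniform(0, 255, size=(64, 64))
noisy = np.clip(img + rng.normal(0, 10, size=img.shape), 0, 255)
print(f"SSIM(original, noisy) = {global_ssim(img, noisy):.3f}")
```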
Citations: 4
Entropy Coding via Parametric Source Model with Applications in Fast and Efficient Compression of Image and Video Data
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.80
K. Minoo, Truong Q. Nguyen
In this paper a framework is proposed for efficient entropy coding of data which can be represented by a parametric distribution model. Based on the proposed framework, an entropy coder achieves coding efficiency by estimating the parameters of the statistical model (for the coded data), either via Maximum A Posteriori (MAP) or Maximum Likelihood (ML) parameter estimation techniques.
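The "estimate parameters, then entropy-code" idea can be illustrated with a concrete parametric model. The sketch below assumes a Laplacian source (a common choice for transform residuals, not necessarily the paper's model), fits it by maximum likelihood, and reports the ideal code length under the fitted model.

```python
import numpy as np

# ML fit of a Laplacian model to integer residuals, then the ideal
# code length (cross-entropy) under that model.
rng = np.random.default_rng(2)
residuals = np.round(rng.laplace(scale=4.0, size=10_000)).astype(int)

# ML estimates for a Laplacian: location = median, scale = mean |x - mu|.
mu = np.median(residuals)
b = np.mean(np.abs(residuals - mu))

# Probability of each integer symbol (density integrated over unit bins),
# then the average ideal bits/symbol an entropy coder could approach.
symbols = np.arange(residuals.min(), residuals.max() + 1)
cdf = lambda x: np.where(x < mu, 0.5 * np.exp((x - mu) / b),
                         1 - 0.5 * np.exp(-(x - mu) / b))
p = np.maximum(cdf(symbols + 0.5) - cdf(symbols - 0.5), 1e-12)
counts = np.bincount(residuals - residuals.min(), minlength=len(symbols))
bits_per_symbol = -(counts * np.log2(p)).sum() / counts.sum()
print(f"model: mu={mu:.2f}, b={b:.2f}, ideal rate = {bits_per_symbol:.3f} bits/symbol")
```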
Citations: 5
High Performance Word-Codeword Mapping Algorithm on PPM
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.40
J. Adiego, Miguel A. Martínez-Prieto, P. Fuente
The word-codeword mapping technique allows words to be managed in PPM modelling when a natural language text file is being compressed. The main idea of managing words is to assign them codes in order to improve the compression. Previous work focused on proposing several adaptive mapping algorithms and evaluating them. In this paper, we propose a semi-static word-codeword mapping method that takes advantage of prior knowledge of some statistical data about the vocabulary. We test our idea by implementing a basic prototype, dubbed mppm2, which also retains all the desirable features of a word-codeword mapping technique. The comparison with other techniques and compressors shows that our proposal is a very competitive choice for compressing natural language texts. In fact, empirical results show that our prototype achieves very good compression for this type of document.
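A semi-static mapping can be pictured as a two-pass process: collect word statistics first, then assign codes by frequency so the PPM stage sees a more compressible symbol stream. The toy below shows only that ranking step under an assumed rule (most frequent word gets the smallest code); the actual mppm2 assignment and the PPM back end are not reproduced.

```python
from collections import Counter

def semi_static_mapping(text):
    """Toy semi-static word-codeword mapping: a first pass collects word
    frequencies, then words are ranked so the most frequent word gets the
    smallest code. In mppm2 these codes feed a PPM model (omitted here)."""
    words = text.split()
    ranking = {w: r for r, (w, _) in enumerate(Counter(words).most_common())}
    return ranking, [ranking[w] for w in words]

ranking, codes = semi_static_mapping("to be or not to be that is the question")
print(ranking)
print(codes)
```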
Citations: 9
Practical Parallel Algorithms for Dictionary Data Compression
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.84
L. Cinque, S. Agostino, L. Lombardi
PRAM CREW parallel algorithms requiring logarithmic time and a linear number of processors exist for sliding (LZ1) and static dictionary compression. On the other hand, LZ2 compression seems hard to parallelize. Both adaptive methods work with prefix dictionaries, that is, all prefixes of a dictionary element are dictionary elements. Therefore, it is reasonable to use prefix dictionaries also for the static method. A left-to-right semi-greedy approach exists to compute an optimal parsing of a string with a prefix static dictionary. The left-to-right greedy approach is enough to achieve optimal compression with a sliding dictionary, since such a dictionary is both prefix- and suffix-closed. We assume the window is bounded by a constant. With the practical assumption that the dictionary elements have constant length, we present PRAM EREW algorithms for sliding and static dictionary compression still requiring logarithmic time and a linear number of processors. A PRAM EREW decoder for static dictionary compression can be easily designed with a linear number of processors and logarithmic time. A work-optimal logarithmic-time PRAM EREW decoder exists for sliding dictionary compression when the window has constant length.

The simplest model for parallel computation is an array of processors with distributed memory and no interconnections, and therefore no communication cost. An approximation scheme to optimal compression with prefix static dictionaries was designed to run on such a model with the same complexity as the previous algorithms. It was presented for a massively parallel architecture, but by virtue of its scalability it can be implemented on a small-scale system as well. We describe this approach and extend it to the sliding dictionary method. The approximation scheme for sliding dictionaries is suitable for small-scale systems, but due to its adaptiveness it is practical for a large-scale system when the file size is large.

A two-dimensional extension of the sliding dictionary method to lossless compression of bi-level images, called BLOCK MATCHING, is also discussed. We designed a parallel implementation of this heuristic on a constant-size array of processors and experimented with up to 32 processors of a 256-processor Intel Xeon 3.06 GHz machine (avogadro.cilea.it) on a test set of large topographic images. We achieved the expected speed-up, obtaining parallel compression and decompression about twenty-five times faster than the sequential versions.
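For readers unfamiliar with the parsing step being parallelized, the sketch below shows a plain left-to-right greedy parse against a static prefix dictionary on a made-up toy dictionary. As the abstract notes, greedy parsing is optimal for sliding dictionaries but only the semi-greedy variant is optimal for a static prefix dictionary; this sequential sketch illustrates the operation itself, not the PRAM algorithms.

```python
def greedy_parse(text, dictionary):
    """Left-to-right greedy parsing with a static prefix dictionary:
    at each position, take the longest dictionary element that matches."""
    max_len = max(len(w) for w in dictionary)
    phrases, i = [], 0
    while i < len(text):
        # Fall back to a single character (assumed to be in the dictionary).
        match = text[i]
        for l in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + l] in dictionary:
                match = text[i:i + l]
                break
        phrases.append(match)
        i += len(match)
    return phrases

dictionary = {"a", "b", "ab", "aba", "bb", "abb"}   # prefix-closed toy dictionary
print(greedy_parse("ababbab", dictionary))           # ['aba', 'bb', 'ab']
```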
Citations: 1
H.264/MPEG-4 AVC Encoder Parameter Selection Algorithms for Complexity Distortion Tradeoff
Pub Date : 2009-03-16 DOI: 10.1109/DCC.2009.53
R. Vanam, E. Riskin, R. Ladner
The H.264 encoder has input parameters that determine the bit rate and distortion of the compressed video and the encoding complexity. A set of encoder parameters is referred to as a parameter setting. We previously proposed two offline algorithms for choosing H.264 encoder parameter settings that have distortion-complexity performance close to the parameter settings obtained from an exhaustive search, but take significantly fewer encodings. However, they generate only a few parameter settings. If no parameter setting is available for a given encode time, the encoder will need to use a lower-complexity parameter setting, resulting in a decrease in peak signal-to-noise ratio (PSNR). In this paper, we propose two algorithms for finding additional parameter settings over our previous algorithm and show that they improve the PSNR by up to 0.71 dB and 0.43 dB, respectively. We test both our algorithms on Linux and PocketPC platforms.
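The selection problem the abstract describes amounts to picking, for a given encode-time budget, the parameter setting with the best quality among those that fit the budget; having more settings reduces the quality drop when the budget falls between points. The sketch below illustrates only that selection step with entirely hypothetical setting names and (time, PSNR) numbers; it is not one of the paper's algorithms.

```python
# Selecting an encoder parameter setting under an encode-time budget:
# take the highest-PSNR setting whose measured encode time fits.
settings = [
    # (name, encode_time_seconds, psnr_dB) -- illustrative values only
    ("ultrafast", 1.2, 34.1),
    ("fast",      2.5, 35.0),
    ("medium",    4.8, 35.6),
    ("slow",      9.7, 35.9),
]

def pick_setting(budget_s):
    feasible = [s for s in settings if s[1] <= budget_s]
    if not feasible:
        return min(settings, key=lambda s: s[1])  # fall back to the cheapest
    return max(feasible, key=lambda s: s[2])      # best PSNR within budget

print(pick_setting(3.0))   # -> ('fast', 2.5, 35.0)
print(pick_setting(10.0))  # -> ('slow', 9.7, 35.9)
```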
Citations: 25