
Latest Publications from the 2009 Data Compression Conference

Design of Punctured LDPC Codes for Rate-Compatible Asymmetric Slepian-Wolf Coding
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.25
F. Cen
This paper considers the design of punctured Low-Density Parity-Check (LDPC) codes for rate-compatible asymmetric Slepian-Wolf (SW) coding of correlated binary memoryless sources. A virtual non-uniform channel is employed to model the rate-compatible asymmetric SW coding based on the puncturing approach, and the degree distributions of the LDPC codes are then optimized exclusively for the smallest and largest puncture ratios. Punctured extended Irregular Repeat-Accumulate (eIRA) codes are introduced and designed as an example to demonstrate the validity of the proposed design method. Simulation results show that the coding efficiency of the designed eIRA codes exceeds all previously reported results.
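As a rough illustration of the syndrome-based Slepian-Wolf setup, the following Python sketch shows how dropping syndrome bits gives a family of rate-compatible operating points. Everything here is an assumption of the sketch rather than the paper's construction: the parity-check matrix is random instead of an optimized LDPC code, the source correlation is a plain BSC, and decoding is brute force instead of belief propagation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 6                         # source length, full syndrome length
H = rng.integers(0, 2, size=(m, n))  # toy parity-check matrix (not LDPC)

x = rng.integers(0, 2, size=n)       # source seen by the encoder
y = x ^ (rng.random(n) < 0.05)       # decoder side information (BSC)
s = (H @ x) % 2                      # full-rate syndrome, m bits

def decode(H_kept, s_kept, y):
    """Brute-force ML decoding: the word closest to y with syndrome s_kept."""
    best, best_d = None, n + 1
    for cand in itertools.product((0, 1), repeat=n):
        c = np.array(cand)
        if np.array_equal((H_kept @ c) % 2, s_kept):
            d = int((c ^ y).sum())
            if d < best_d:
                best, best_d = c, d
    return best

# Rate compatibility by puncturing: drop trailing syndrome bits to raise the
# compression ratio; the decoder simply ignores the corresponding checks.
for punct in range(4):
    kept = m - punct
    x_hat = decode(H[:kept], s[:kept], y)
    print(f"{kept} syndrome bits kept -> {int((x_hat ^ x).sum())} symbol errors")
```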
Citations: 1
The Posterior Matching Feedback Scheme for Joint Source-Channel Coding with Bandwidth Expansion
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.79
O. Shayevitz, M. Feder
When transmitting a Gaussian source over an AWGN channel with an input power constraint and a quadratic distortion measure, it is well known that optimal performance can be obtained using an analog joint source-channel scalar scheme that merely scales the input and output signals. In the case of bandwidth expansion, no such simple joint source-channel analog scheme attains optimal performance. However, when feedback is available, a simple sequential analog linear procedure based on the Schalkwijk-Kailath communication scheme is optimal. Recently, we have introduced a fundamental feedback communication scheme, termed "posterior matching", which generalizes the Schalkwijk-Kailath scheme to arbitrary memoryless channels and input distributions. In this paper, we show how the posterior matching scheme can be adapted to the joint source-channel coding setting with bandwidth expansion and a general distortion measure when feedback is available.
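The Schalkwijk-Kailath recursion that posterior matching generalizes is simple enough to simulate in a few lines. The sketch below is a simplified illustration, not the paper's scheme: the uniform prior (variance 1/12), the MMSE update, and the number of rounds are illustrative choices, and the power constraint holds only on average over the message ensemble.

```python
import numpy as np

rng = np.random.default_rng(1)
P, sigma2, n_rounds = 1.0, 0.1, 15      # power constraint, noise variance
theta = 0.6180339                       # message point, modeled as U[0, 1]

est, err_var = 0.5, 1.0 / 12.0          # prior mean and variance of theta
for k in range(n_rounds):
    # Transmit the receiver's current estimation error, scaled to power P.
    x = np.sqrt(P / err_var) * (theta - est)
    y = x + rng.normal(0.0, np.sqrt(sigma2))     # forward AWGN channel
    # MMSE update; noiseless feedback of y tells the transmitter `est`.
    est += np.sqrt(P * err_var) / (P + sigma2) * y
    err_var *= sigma2 / (P + sigma2)    # error variance shrinks geometrically
    print(f"round {k + 1:2d}: |error| = {abs(est - theta):.2e}")
```

Each channel use multiplies the error variance by sigma2 / (P + sigma2), which is what makes the linear feedback recursion so effective.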
Citations: 2
Compressed Transitive Delta Encoding
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.46
Dana Shapira
Given a source file $S$ and two differencing files $\Delta(S,T)$ and $\Delta(T,R)$, where $\Delta(X,Y)$ denotes the delta file of the target file $Y$ with respect to the source file $X$, the objective is to be able to construct $R$. This is intended for the scenario of upgrading software where intermediate releases are missing, or for file system backups, where non-consecutive versions must be recovered. The traditional way is to decompress $\Delta(S,T)$ in order to construct $T$, and then apply $\Delta(T,R)$ to $T$ to obtain $R$. The Compressed Transitive Delta Encoding (CTDE) paradigm, introduced in this paper, constructs a delta file $\Delta(S,R)$ by working directly on the two given delta files, $\Delta(S,T)$ and $\Delta(T,R)$, without any decompression or use of the base file $S$. A new algorithm for solving CTDE is proposed and its compression performance is compared against the traditional "double delta decompression". Not only does it use constant additional space, as opposed to the traditional method's linear additional memory, but experiments show that the size of the delta files involved is reduced by 15% on average.
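The transitive idea can be illustrated with a toy delta format of COPY/ADD commands; this format, and the byte-level resolution below, are assumptions of the sketch rather than the paper's encoding. The point is that $\Delta(T,R)$ commands are rewritten to reference $S$ directly by resolving them through $\Delta(S,T)$, so $T$ is never reconstructed.

```python
# A delta is a list of ("ADD", bytes) or ("COPY", offset, length) commands,
# with COPY referring to the delta's source file.
def apply_delta(source: bytes, delta) -> bytes:
    out = bytearray()
    for cmd in delta:
        if cmd[0] == "ADD":
            out += cmd[1]
        else:
            _, off, ln = cmd
            out += source[off:off + ln]
    return bytes(out)

def compose(d_st, d_tr):
    """Rewrite delta(T,R) so its commands reference S instead of T."""
    # Map every position of T to its origin: ("S", offset) or ("A", literal).
    origin = []
    for cmd in d_st:
        if cmd[0] == "ADD":
            origin += [("A", b) for b in cmd[1]]
        else:
            _, off, ln = cmd
            origin += [("S", off + i) for i in range(ln)]
    out = []
    def emit(piece):                    # merge adjacent compatible pieces
        if out and out[-1][0] == "ADD" == piece[0]:
            out[-1] = ("ADD", out[-1][1] + piece[1])
        elif (out and out[-1][0] == "COPY" == piece[0]
              and out[-1][1] + out[-1][2] == piece[1]):
            out[-1] = ("COPY", out[-1][1], out[-1][2] + piece[2])
        else:
            out.append(piece)
    for cmd in d_tr:
        if cmd[0] == "ADD":
            emit(("ADD", cmd[1]))
            continue
        _, off, ln = cmd
        for kind, val in origin[off:off + ln]:    # resolve through T's origin
            emit(("COPY", val, 1) if kind == "S" else ("ADD", bytes([val])))
    return out

S = b"the quick brown fox"
d_st = [("COPY", 0, 10), ("ADD", b"red fox")]     # T = b"the quick red fox"
d_tr = [("COPY", 4, 6), ("ADD", b"crimson"), ("COPY", 13, 4)]
T = apply_delta(S, d_st)
R = apply_delta(T, d_tr)
assert apply_delta(S, compose(d_st, d_tr)) == R   # R built straight from S
print(compose(d_st, d_tr))   # [('COPY', 4, 6), ('ADD', b'crimson fox')]
```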
Citations: 7
Block Size Optimization in Deduplication Systems
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.51
C. Constantinescu, J. Pieper, Tiancheng Li
Data deduplication is a popular dictionary-based compression method for storage archiving and backup. The deduplication efficiency ("chunk" matching) improves for smaller chunk sizes; however, the files become highly fragmented, requiring many disk accesses during reconstruction, or "chattiness" in a client-server architecture. Within the sequence of chunks into which an object (file) is decomposed, sub-sequences of adjacent chunks tend to repeat. We exploit this insight to optimize the chunk sizes by joining repeated sub-sequences of small chunks into new "super chunks", under the constraint of achieving practically the same matching performance. We employ suffix arrays to find these repeating sub-sequences and to determine a new encoding that covers the original sequence. With super chunks we significantly reduce fragmentation, improving reconstruction time and the overall deduplication ratio by lowering the amount of metadata (fewer hashes and dictionary entries).
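As a minimal sketch of the repeated-sub-sequence step, assume single-character chunk fingerprints, a naively built suffix array (O(n^2 log n), versus the linear-time constructions used in practice), and a greedy re-encoding; the paper's covering construction is more elaborate.

```python
def suffix_array(seq):
    return sorted(range(len(seq)), key=lambda i: seq[i:])

def lcp(a, b):
    """Length of the longest common prefix of two sequences."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def longest_repeat(seq):
    """Longest sub-sequence occurring at least twice, via adjacent suffixes."""
    sa = suffix_array(seq)
    best_len, best_at = 0, 0
    for i, j in zip(sa, sa[1:]):
        n = lcp(seq[i:], seq[j:])
        if n > best_len:
            best_len, best_at = n, i
    return seq[best_at:best_at + best_len]

# A file decomposed into small chunks; letters stand for chunk fingerprints.
chunks = list("abcXabcYabcZqabc")
rep = longest_repeat(chunks)
print("repeated run:", "".join(rep))              # abc

# Re-encode the chunk stream with the run folded into one "super chunk".
joined, i = [], 0
while i < len(chunks):
    if chunks[i:i + len(rep)] == rep:
        joined.append("".join(rep)); i += len(rep)
    else:
        joined.append(chunks[i]); i += 1
print("re-encoded:", joined)   # 8 dictionary references instead of 16
```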
Citations: 9
Lossy Hyperspectral Images Coding with Exogenous Quasi Optimal Transforms
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.8
M. Barret, J. Gutzwiller, Isidore Paul Akam Bita, F. D. Vedova
It is well known in transform coding that the Karhunen-Loève Transform (KLT) can be suboptimal for non-Gaussian sources. However, in many applications using JPEG 2000 Part 2 codecs, the KLT is generally considered the optimal linear transform for reducing redundancies between components of hyperspectral images. In previous works, optimal spectral transforms (OST) compatible with the JPEG 2000 Part 2 standard have been introduced; they perform better than the KLT but at a heavier computational cost. In this paper, we show that an OST computed on a learning basis consisting of Hyperion hyperspectral images acquired by one sensor performs very well, and even better than the KLT, on other images from the same sensor.
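The spectral KLT baseline takes only a few lines; in the sketch below a synthetic band mixture stands in for Hyperion imagery, and an "exogenous" transform in the paper's sense would reuse the basis V on images it was not trained on.

```python
import numpy as np

rng = np.random.default_rng(2)
bands, pixels = 8, 5000
mix = rng.normal(size=(bands, bands))
X = mix @ rng.normal(size=(bands, pixels))     # spectrally correlated bands

Xc = X - X.mean(axis=1, keepdims=True)
cov = Xc @ Xc.T / pixels
eigval, V = np.linalg.eigh(cov)                # KLT basis = eigenvectors
Y = V.T @ Xc                                   # decorrelated components

resid = np.cov(Y) - np.diag(np.diag(np.cov(Y)))
print("max residual inter-band covariance:", float(np.abs(resid).max()))
```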
Citations: 11
Joint Source-Channel Coding at the Application Layer
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.10
Ozgun Y. Bursalioglu, Maria Fresia, G. Caire, H. Poor
The multicasting of an independent and identically distributed Gaussian source over a binary erasure broadcast channel is considered. This model applies to a one-to-many transmission scenario in which some mechanism at the physical layer delivers information packets with losses represented by erasures, and users are subject to different erasure probabilities. The reconstruction signal-to-noise ratio (SNR) region achieved by concatenating a multiresolution source code with a broadcast channel code is characterized, and four convex optimization problems corresponding to different performance criteria are solved. Each problem defines a particular operating point on the dominant face of the SNR region. Layered joint source-channel codes are constructed based on the concatenation of embedded scalar quantizers with binary Raptor encoders. The proposed schemes are shown to operate very close to the theoretical optimum.
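The embedded scalar quantizer that provides the layering can be sketched with plain bitplane (binary-expansion) quantization on a clipped range; the clipping range, the midpoint reconstruction, and the omission of the Raptor coding of each layer are all simplifications of this sketch. A user who decodes more layers (fewer erasures) sees a higher reconstruction SNR.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100_000)                   # i.i.d. Gaussian source
lo, hi = -4.0, 4.0
frac = (np.clip(x, lo, hi - 1e-9) - lo) / (hi - lo)   # position in [0, 1)

bits, f = [], frac.copy()
for _ in range(8):                             # MSB-first binary expansion
    f *= 2
    b = (f >= 1.0).astype(float)
    bits.append(b)
    f -= b

frac_hat = np.zeros_like(frac)
for k, b in enumerate(bits, start=1):          # decode layer by layer
    frac_hat += b / 2.0 ** k
    x_hat = lo + (hi - lo) * (frac_hat + 2.0 ** -(k + 1))   # cell midpoint
    snr = 10 * np.log10(np.mean(x ** 2) / np.mean((x - x_hat) ** 2))
    print(f"{k} layer(s): SNR = {snr:5.2f} dB")   # ~6 dB gained per layer
```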
Citations: 10
Low Bit Rate Vector Quantization of Outlier Contaminated Data Based on Shells of Golay Codes
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.62
I. Tabus, A. Vasilache
In this paper we study how to encode N-long vectors, with N in the range of hundreds, at low bit rates of 0.5 bits per sample or lower. We adopt a vector quantization structure in which an overall gain is encoded with a scalar quantizer, and the remaining scaled vector is encoded with a vector quantizer built by combining smaller (length-L) binary codes known to fill the space efficiently; the important examples discussed here are the Golay codes. Due to the typically nonstationary distribution of the long vectors, a piecewise-stationary-plus-contamination model is assumed. The generic solution is to encode the outliers using Golomb-Rice codes and, for each L-long subvector, to encode the vector of absolute values using the nearest neighbor in a certain shell of a chosen binary {0,1} code, the sign information being transmitted separately. The rate-distortion optimization problem can be organized and solved very efficiently for the unknowns, which include the Hamming weights of the chosen shells for each of the N/L subvectors and the overall gain g. The essential properties that influence the selection of a particular binary code as a building block are its space-filling properties, the number of shells of various Hamming weights (allowing more or less flexibility in the rate-distortion optimization), the closeness of N to a multiple of L, and the existence of a fast nearest-neighbor search on a shell. We show results when using the Golay codes for vector quantization in audio coding applications.
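The fast nearest-neighbor search is easy to state for the full shell of all weight-w binary vectors: the w ones go on the w largest magnitudes, and the least-squares gain for that shape is the mean of the selected magnitudes. The sketch below uses this full shell as an assumption; restricting the shell to an actual Golay code, as the paper does, requires a code-specific search that is omitted here.

```python
import numpy as np

rng = np.random.default_rng(4)
L = 24                                  # sub-vector length (Golay length)
x = rng.laplace(size=L)                 # one L-long sub-vector

def shell_quantize(x, w):
    """Quantize onto the shell of weight-w patterns; signs sent separately."""
    idx = np.argsort(-np.abs(x))[:w]    # w largest magnitudes get the ones
    shape = np.zeros(x.size)
    shape[idx] = np.sign(x[idx])        # {0,1} pattern combined with signs
    gain = np.abs(x[idx]).mean()        # least-squares gain for this shape
    return gain * shape

for w in (2, 6, 12):
    mse = np.mean((x - shell_quantize(x, w)) ** 2)
    # rate ~ log2(C(L, w)) shape bits + w sign bits + the gain's bits
    print(f"w = {w:2d}: MSE = {mse:.4f}")
```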
Citations: 6
On the Use of Word Alignments to Enhance Bitext Compression
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.22
Miguel A. Martínez-Prieto, J. Adiego, F. Sánchez-Martínez, P. Fuente, Rafael C. Carrasco
This paper describes a novel approach to the compression of bilingual parallel corpora (bitexts). The approach takes advantage of the fact that the two texts forming a bitext are mutual translations. First, the two texts are aligned at both the sentence and the word level. Then, the word alignments are used to define biwords: pairs of words, one from each text, that are mutual translations. Finally, a biword-based PPM compressor is applied. Compressing the two texts of the bitext together improves on the compression ratios achieved when both texts are independently compressed with a word-based PPM compressor, thus saving storage and transmission costs.
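A sketch of the biword construction on a hand-made example: each alignment link fuses an English word and a Spanish word into one token. The one-to-one toy alignment is an assumption, and zlib is only a stand-in for the paper's word-based PPM model.

```python
import zlib

en = "the house is small".split()
es = "la casa es pequeña".split()
links = [(0, 0), (1, 1), (2, 2), (3, 3)]      # (en index, es index) pairs

biwords = [f"{en[i]}|{es[j]}" for i, j in links]
stream = " ".join(biwords).encode("utf-8")
print("biword stream:", stream.decode())      # the|la house|casa is|es ...

separate = (len(zlib.compress(" ".join(en).encode()))
            + len(zlib.compress(" ".join(es).encode())))
joint = len(zlib.compress(stream))
print(f"separate: {separate} bytes, joint biword stream: {joint} bytes")
```

On a corpus of realistic size, the joint biword stream lets the model exploit cross-language redundancy that two independent compressors cannot see.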
Citations: 9
A MS-SSIM Optimal JPEG 2000 Encoder
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.15
T. Richter, Kil Joong Kim
In this work, we present an SSIM-optimal JPEG 2000 rate allocation algorithm. However, our aim is not so much to improve the visual performance of JPEG 2000 as to study the performance of the SSIM full-reference metric by means beyond correlation measurements. Full-reference image quality metrics assign a quality index to a pair consisting of a reference image and a distorted image. The performance of a metric is then measured by the degree of correlation between the scores obtained from the metric and those from subjective tests. The aim of a rate allocation algorithm is to minimize the distortion created by a lossy image compression scheme under a rate constraint. Noting this relation between objective function and performance evaluation allows us to define an alternative approach to evaluating the usefulness of a candidate metric: we judge the quality of a metric by its ability to define an objective function for rate control purposes, and evaluate the images compressed by this scheme subjectively. It turns out that deficiencies of image quality metrics become much more easily visible, even in the literal sense, than under traditional correlation experiments. Our candidate metric in this work is the SSIM index proposed by Sheikh and Bovik, which is simple enough to be implemented efficiently in rate control algorithms yet correlates better with visual quality than MSE; our candidate compression scheme is the highly flexible JPEG 2000 standard.
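For reference, here is a sketch of the single-scale SSIM index that the MS-SSIM objective builds on, computed on non-overlapping 8x8 blocks with the usual stabilizing constants; the paper's rate allocation over JPEG 2000 code-block contributions is not reproduced.

```python
import numpy as np

def ssim_blocks(a, b, L=255.0, k1=0.01, k2=0.03, n=8):
    """Mean SSIM over non-overlapping n x n blocks of two images."""
    C1, C2 = (k1 * L) ** 2, (k2 * L) ** 2
    scores = []
    for i in range(0, a.shape[0] - n + 1, n):
        for j in range(0, a.shape[1] - n + 1, n):
            pa, pb = a[i:i + n, j:j + n], b[i:i + n, j:j + n]
            ma, mb = pa.mean(), pb.mean()
            va, vb = pa.var(), pb.var()
            cov = ((pa - ma) * (pb - mb)).mean()
            scores.append((2 * ma * mb + C1) * (2 * cov + C2)
                          / ((ma ** 2 + mb ** 2 + C1) * (va + vb + C2)))
    return float(np.mean(scores))

rng = np.random.default_rng(5)
img = rng.integers(0, 256, size=(64, 64)).astype(float)
noisy = np.clip(img + rng.normal(0, 10, img.shape), 0, 255)
print("SSIM(img, img)   =", ssim_blocks(img, img))        # exactly 1.0
print("SSIM(img, noisy) =", round(ssim_blocks(img, noisy), 3))
```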
Citations: 39
A Binary Image Scalable Coder Based on Reversible Cellular Automata Transform and Arithmetic Coding
Pub Date: 2009-03-16 DOI: 10.1109/DCC.2009.59
S. Milani, Carlos Cruz-Reyes, J. Kari, G. Calvagno
The paper presents an efficient scalable coding approach for bi-level images that relies on reversible non-linear transformations performed by subclasses of Cellular Automata. At each transformation stage, the input image is converted into four subimages, which are coded separately. In this work we delineate an effective strategy for the entropy coder to encode the transformed image into a binary bit stream, outperforming previously obtained compression results and comparing well with the standard JBIG. Experimental results show that our method is more efficient for images in which the black pixels lie within a connected region, and for multiple decomposition levels.
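Without the paper's automaton rules, only the structural step is easy to reproduce: the sketch below performs a trivially reversible 2x2 polyphase split into four subimages, a stand-in for the reversible cellular-automata transform stage.

```python
import numpy as np

def split4(img):
    """One stage: split a bi-level image into its four 2x2 polyphase parts."""
    return img[0::2, 0::2], img[0::2, 1::2], img[1::2, 0::2], img[1::2, 1::2]

def merge4(a, b, c, d):
    """Exact inverse of split4."""
    h, w = a.shape
    img = np.empty((2 * h, 2 * w), dtype=a.dtype)
    img[0::2, 0::2], img[0::2, 1::2] = a, b
    img[1::2, 0::2], img[1::2, 1::2] = c, d
    return img

rng = np.random.default_rng(6)
img = rng.integers(0, 2, size=(8, 8))         # toy bi-level image
subs = split4(img)
assert np.array_equal(merge4(*subs), img)     # perfectly reversible
print([s.shape for s in subs])                # four 4x4 subimages
```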
Citations: 4