Sorted Sliding Window Compression
Author: U. Graf
Published in: Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096)
Publication date: 1999-03-29
DOI: 10.1109/DCC.1999.785684 (https://doi.org/10.1109/DCC.1999.785684)
Citations: 3

Abstract

Summary form only given.
Sorted sliding window compression
Sorted Sliding Window Compression (SSWC) uses a new model, the Sorted Sliding Window Model (SSWM), to efficiently encode strings that reappear while encoding a symbol sequence. The SSWM holds statistics of all strings up to a certain length k in a sliding window of size n (the sliding window is defined as in LZ77). The compression program can use the SSWM to determine whether the string formed by the next symbols is already contained in the sliding window; the model returns the length of the match. The SSWM directly provides statistics (the borders of a subinterval within an interval) for use in entropy coding methods such as Arithmetic Coding or Dense Coding [Gra97]. Given a number in an interval and a string length, the SSWM returns the corresponding string, which is used in decompression. After each encoding (decoding) step, the model is updated with the characters just encoded (decoded). The model sorts all string starting points in the sliding window lexicographically. A simple way to implement the SSWM is by exhaustive search in the sliding window; the implementation used here is based on a B-tree together with special binary searches.

SSWC is a simple compression scheme that uses this new model in order to evaluate its properties. It looks at the next characters to encode and determines the longest match with the SSWM. If the match length is smaller than 2, the character is encoded. Otherwise, the length and the subinterval of the string are encoded. The length values are encoded together with the single characters using the same adaptive frequency model. Additionally, some rules are used to reduce the match length if the code length would otherwise get worse. Frequencies and intervals are encoded with Dense Coding.

SSWC is on average better than gzip [Gai93] on the Calgary corpus: 0.2 to 0.5 bits per byte better on most files and at most 0.03 bits per byte worse (progc and progl). This demonstrates the quality of the SSWM and gives confidence in its usability as a new building block in compression models. The SSWM has O(log k) computing complexity for all operations and needs O(n) space. It can be used to implement PPM or Markov models in limited-space environments because it holds all the necessary information.
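
The model queries and the encoding rule described in the abstract can be illustrated with a small Python sketch. It is not the author's implementation: it follows the "simple way" mentioned in the abstract (exhaustive search, re-sorting the window on every query) rather than the B-tree, and it stops at producing tokens instead of feeding them to Dense Coding or an arithmetic coder. The names `NaiveSortedWindow`, `sswc_tokens`, and `sswc_decode`, the token layout, and the default values of n and k are all assumptions made for illustration, and the rules for shortening a match are omitted.

```python
from bisect import bisect_left, bisect_right


class NaiveSortedWindow:
    """Hypothetical toy stand-in for the SSWM (exhaustive search, no B-tree)."""

    def __init__(self, n, k):
        self.n = n          # sliding window size
        self.k = k          # maximum string length the model answers for
        self.window = ""

    def update(self, text):
        # Append the symbols just coded and keep only the last n symbols.
        self.window = (self.window + text)[-self.n:]

    def _keys(self, length):
        # All substrings of exactly `length` symbols, sorted lexicographically.
        # Start positions with fewer than `length` symbols left are skipped.
        w = self.window
        return sorted(w[i:i + length] for i in range(len(w) - length + 1))

    def longest_match(self, lookahead):
        # Length of the longest prefix of `lookahead` (at most k symbols)
        # that occurs somewhere in the window; 0 if even one symbol is new.
        for length in range(min(self.k, len(lookahead)), 0, -1):
            if lookahead[:length] in self.window:
                return length
        return 0

    def interval(self, s):
        # Borders [lo, hi) of s among the `total` sorted strings of its length.
        keys = self._keys(len(s))
        return bisect_left(keys, s), bisect_right(keys, s), len(keys)

    def string_at(self, number, length):
        # Inverse query used when decoding: sorted index -> string.
        return self._keys(length)[number]


def sswc_tokens(data, n=32, k=8):
    """Turn `data` into literal and match tokens (entropy coding omitted)."""
    model = NaiveSortedWindow(n, k)
    tokens, pos = [], 0
    while pos < len(data):
        length = model.longest_match(data[pos:pos + k])
        if length < 2:
            # Match shorter than 2: code the single character.
            tokens.append(("char", data[pos]))
            step = 1
        else:
            # Otherwise code the length and the subinterval of the string.
            s = data[pos:pos + length]
            tokens.append(("match", length) + model.interval(s))
            step = length
        model.update(data[pos:pos + step])
        pos += step
    return tokens


def sswc_decode(tokens, n=32, k=8):
    """Rebuild the text by replaying the same model updates as the encoder."""
    model = NaiveSortedWindow(n, k)
    out = []
    for tok in tokens:
        if tok[0] == "char":
            s = tok[1]
        else:
            _, length, lo, _hi, _total = tok
            # Any number inside [lo, hi) maps back to the same string.
            s = model.string_at(lo, length)
        out.append(s)
        model.update(s)
    return "".join(out)
```

Calling `sswc_decode(sswc_tokens(text))` should return `text` unchanged, because encoder and decoder update the window with the same symbols after every step. In a real coder the triple `(lo, hi, total)` would be handed to the entropy coder as the subinterval to narrow to, which is why the sketch returns interval borders rather than a window offset as LZ77 would.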