An overhead reduction technique for mega-state compression schemes

Proceedings DCC '97. Data Compression Conference. Pub Date: 1997-03-25. DOI: 10.1109/DCC.1997.582061

A. Bookstein, S. T. Klein, T. Raita


Citations: 5

Abstract

Many of the most effective compression methods involve complicated models. Unfortunately, as model complexity increases, so does the cost of storing the model itself. This paper examines a method to reduce the amount of storage needed to represent a Markov model with an extended alphabet, by applying a clustering scheme that brings together similar states. Experiments run on a variety of large natural language texts show that much of the overhead of storing the model can be saved at the cost of a very small loss of compression efficiency.
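The abstract does not say how states are compared or merged, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes an order-1 Markov model over single characters (the paper uses an extended alphabet), a greedy one-pass clustering rule, and a hypothetical threshold parameter `max_extra_bits_per_symbol`. States whose next-symbol distributions are similar share one pooled distribution, so fewer distributions have to be stored, and the code measures how many extra bits the sharing costs.

```python
import math
from collections import Counter, defaultdict


def build_model(text):
    """Order-1 Markov model: for each state (previous symbol), count next symbols."""
    model = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        model[prev][nxt] += 1
    return model


def coding_cost(counts, dist_counts):
    """Bits needed to code the transitions in `counts` with the empirical
    distribution of `dist_counts` (which must cover every symbol in `counts`)."""
    total = sum(dist_counts.values())
    return -sum(c * math.log2(dist_counts[s] / total) for s, c in counts.items())


def cluster_states(model, max_extra_bits_per_symbol):
    """Greedy heuristic (an assumption of this sketch): visit states from most to
    least frequent; merge each into the cluster whose pooled counts code it most
    cheaply if the extra cost per transition is within the threshold, otherwise
    open a new cluster. Only the incoming state's extra cost is checked."""
    clusters = []      # pooled next-symbol counts, one Counter per cluster
    assignment = {}    # state -> index into `clusters`
    for state in sorted(model, key=lambda s: -sum(model[s].values())):
        counts = model[state]
        n = sum(counts.values())
        own = coding_cost(counts, counts)
        best, best_extra = None, None
        for i, pooled in enumerate(clusters):
            extra = (coding_cost(counts, pooled + counts) - own) / n
            if best_extra is None or extra < best_extra:
                best, best_extra = i, extra
        if best is not None and best_extra <= max_extra_bits_per_symbol:
            clusters[best] = clusters[best] + counts
            assignment[state] = best
        else:
            clusters.append(Counter(counts))
            assignment[state] = len(clusters) - 1
    return clusters, assignment


def total_cost(model, clusters, assignment):
    """Total bits to code all transitions using each state's cluster distribution."""
    return sum(coding_cost(model[s], clusters[assignment[s]]) for s in model)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the rat ate the hat " * 10
    model = build_model(sample)
    baseline = sum(coding_cost(c, c) for c in model.values())
    clusters, assignment = cluster_states(model, max_extra_bits_per_symbol=0.5)
    print("states:", len(model), "clusters:", len(clusters))
    print("bits without clustering:", round(baseline, 1))
    print("bits with clustering:", round(total_cost(model, clusters, assignment), 1))
```

Raising the threshold stores fewer distributions at the price of more bits per coded symbol, which is the trade-off the abstract describes. A real scheme would also have to account for the cost of storing the state-to-cluster mapping, which this sketch ignores.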
