Alternative algorithms involving alphabetic coding and limited-length coding in lossless data compression of text files

2016 International Symposium on Fundamentals of Electrical Engineering (ISFEE) Pub Date : 2016-06-01 DOI:10.1109/ISFEE.2016.7803154

R. Radescu

Citations: 0

Abstract

Coding algorithms generally aim to minimize the output code length and to make encoding fast once the code has been designed. Moreover, most codes use a binary alphabet. This paper examines other issues related to coding, such as additional constraints imposed on the channel. Code generation is considered for the case where there is a limit on codeword length. A practical application of such limits is data compression in which fast decoding is essential. When every codeword fits within a single machine word (usually 32 bits, sometimes 64), canonical decoding can be used. If this limit cannot be guaranteed, however, slower decoding methods are required. The paper also deals with the generation of alphabetic codes, in which the lexicographic order of the codewords must match the original order in which the symbols were presented to the coding system. When an alphabetic code is used to compress a database, the compressed records can be sorted into the same order they would have had if they were first decompressed and then sorted. Alphabetic code trees also correspond to binary search trees, which have applications in a wide variety of search problems. The assumption that symbols are sorted by probability does not hold in this scenario. The problem of finding codes for non-binary channel alphabets is examined in detail. The experimental results that follow cover alphabetic coding and limited-length coding.
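The order-preserving property of alphabetic codes described in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's algorithm: it assumes a simple weight-balancing heuristic (split the ordered symbol list where the two sides are closest in total weight, then recurse), which yields a valid but not necessarily optimal alphabetic code. The function names `alphabetic_code` and `encode`, the five-letter alphabet, and the integer weights are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence


def alphabetic_code(symbols: Sequence[str], weights: Sequence[float]) -> Dict[str, str]:
    """Build a prefix-free binary code whose codewords keep the order of `symbols`.

    Heuristic (an assumption of this sketch, not an optimal construction):
    cut the ordered list where left and right weights are most balanced,
    give the left part a '0' bit and the right part a '1' bit, and recurse.
    """
    codes: Dict[str, str] = {}
    if not symbols:
        return codes

    def split(lo: int, hi: int, prefix: str) -> None:
        if hi - lo == 1:
            # A lone symbol at the root still needs one bit.
            codes[symbols[lo]] = prefix or "0"
            return
        total = sum(weights[lo:hi])
        best_cut, best_gap, left = lo + 1, float("inf"), 0
        for cut in range(lo + 1, hi):
            left += weights[cut - 1]
            gap = abs(total - 2 * left)  # |right weight - left weight|
            if gap < best_gap:
                best_cut, best_gap = cut, gap
        split(lo, best_cut, prefix + "0")
        split(best_cut, hi, prefix + "1")

    split(0, len(symbols), "")
    return codes


def encode(record: str, codes: Dict[str, str]) -> str:
    """Concatenate the codewords of a record as a string of '0'/'1' characters."""
    return "".join(codes[ch] for ch in record)


if __name__ == "__main__":
    # Symbols are given in alphabetical order, NOT sorted by weight.
    codes = alphabetic_code(list("abcde"), [1, 4, 1, 3, 1])
    records = ["bead", "abe", "cab", "ab", "dace", "bed", "ace"]
    by_code = sorted(records, key=lambda r: encode(r, codes))
    print(codes)
    print(by_code == sorted(records))
```

Because the code is prefix-free and its codewords are in symbol order, two encoded records first differ at a bit where the smaller symbol has 0 and the larger has 1, so comparing the bit strings gives the same result as comparing the original records. The comparison here is on unpadded bit strings; a packed binary representation would need care with padding bits.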
