Entropy and Optimal Compression of Some General Plane Trees
Z. Golebiewski, A. Magner, W. Szpankowski
ACM Transactions on Algorithms (TALG), published 2018-10-01
DOI: https://doi.org/10.1145/3275444
We continue developing the information theory of structured data. In this article, we study models generating d-ary trees (d ≥ 2) and trees with unrestricted degree. We first compute the entropy, which gives the fundamental lower bound on the compression of such trees. Then we present efficient compression algorithms based on arithmetic encoding that achieve the entropy within a constant number of bits. A naïve implementation of these algorithms has a prohibitive time complexity of O(n^d) elementary arithmetic operations (each corresponding to a number f(n, d) of bit operations), but our efficient algorithms run in O(n^2) of these operations, where n is the number of nodes. It turns out that extending source coding (i.e., compression) from sequences to advanced data structures such as degree-unconstrained trees is mathematically quite challenging and leads to recurrences that find ample applications in the information theory of general structures (e.g., to analyze the information content of degree-unconstrained non-plane trees).
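As a rough point of reference for what "achieving the entropy within a constant number of bits" means, the sketch below uses a deliberately simple stand-in model, not the tree models analyzed in the article: binary plane trees with n internal nodes drawn uniformly at random. Under that assumption the entropy is log2 of the n-th Catalan number, and a Shannon-Fano-Elias style arithmetic code spends ceil(log2(1/p)) + 1 bits on an object of probability p, which stays less than 2 bits above the entropy. The function names are illustrative only.

```python
from math import ceil, comb, log2


def catalan(n: int) -> int:
    """Number of binary plane trees with n internal nodes (n-th Catalan number)."""
    return comb(2 * n, n) // (n + 1)


def uniform_entropy_bits(n: int) -> float:
    """Entropy in bits of a uniformly random binary plane tree with n internal nodes.

    Assumption: uniform distribution, chosen only for illustration.
    """
    return log2(catalan(n))


def sfe_code_length_bits(p: float) -> int:
    """Code length of a Shannon-Fano-Elias style code for an outcome of probability p."""
    return ceil(-log2(p)) + 1


if __name__ == "__main__":
    for n in (1, 5, 20, 100):
        h = uniform_entropy_bits(n)
        length = sfe_code_length_bits(1 / catalan(n))
        # The gap between code length and entropy stays below 2 bits for every n.
        print(f"n={n:3d}  entropy={h:8.3f} bits  code length={length} bits  gap={length - h:.3f}")
```

For large n the entropy in this uniform model grows like 2n bits, so the constant overhead of the code becomes negligible relative to the tree's information content.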