Codes for Correcting a Burst of Edits Using Weighted-Summation VT Sketch
Authors: Yubo Sun; Gennian Ge
Journal: IEEE Transactions on Information Theory, vol. 71, no. 3, pp. 1631-1646
Publication date: 2025-01-16
DOI: 10.1109/TIT.2025.3530506
URL: https://ieeexplore.ieee.org/document/10844033/
Impact factor: 2.2 (JCR Q3, Computer Science, Information Systems)
Citations: 0
Abstract
Burst errors arise in a variety of applications. A burst of t edits is a burst of t deletions, a burst of t insertions, or a burst of t substitutions. This paper studies codes that can correct a burst of t edits. Our primary tool is the weighted-summation VT sketch. The $(t,k)$-weighted-summation VT sketch of a length-n sequence is defined as the weighted sum of the VT sketches of the rows of its $t\times \lceil n/t \rceil$ array representation, where the i-th row has weight $k^{i-1}$ for $i=1,2,\ldots,t$. By employing the weighted-summation VT sketch alongside multiple weight sketches, we introduce a construction of q-ary t-burst-substitution-correcting codes with redundancy $\log n+O(1)$, where the logarithm is to base 2. Subsequently, we improve the redundancy for specific types of burst-substitution errors, such as inversion errors, adjacent-block-transposition errors, and absorption errors. Moreover, by utilizing the method developed for burst-substitution-correcting codes and additionally imposing run-length-limited constraints, locally-bounded constraints, and strong-locally-balanced constraints, respectively, we introduce three constructions of t-burst-deletion-correcting codes, each with redundancy $\log n+O(\log \log n)$. Since any t-burst-deletion-correcting code is also a t-burst-insertion-correcting code, we can intersect the t-burst-substitution-correcting codes and t-burst-deletion-correcting codes designed above to derive three constructions of q-ary t-burst-edit-correcting codes. The first two constructions have redundancy $\log n+(t\log q-1)\log \log n+O(1)$, while the third has redundancy $\log n+\log \log n+O(1)$. Most of the proposed codes outperform previous results, with the exception of the burst-deletion-correcting codes. Furthermore, for single-edit errors (t-burst-edit errors with $t=1$), the redundancy of the first two constructions of quaternary single-edit-correcting codes improves on the results of Gabrys et al. (IEEE Trans. Inf. Theory 2023). We also provide efficient encoding and decoding algorithms for our codes to enhance their practical usability.
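
The definition of the $(t,k)$-weighted-summation VT sketch given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the VT sketch of a row is taken to be the classical sum of position times symbol with 1-based positions, the $t\times \lceil n/t \rceil$ array is filled column by column (so row i holds symbols i, i+t, i+2t, ...), a short final column is padded with zeros, and no modular reduction is applied. The function names `vt_sketch`, `array_rows`, and `weighted_summation_vt` are hypothetical.

```python
from math import ceil


def vt_sketch(row):
    """Classical VT sketch (assumed form): sum of j * row[j], positions j starting at 1."""
    return sum(j * symbol for j, symbol in enumerate(row, start=1))


def array_rows(x, t):
    """Rows of the t x ceil(n/t) array representation of x.

    Assumptions: the array is filled column by column, and a short
    final column is padded with zeros.
    """
    cols = ceil(len(x) / t)
    padded = list(x) + [0] * (t * cols - len(x))
    # Row i (0-based) collects every t-th symbol starting at index i.
    return [padded[i::t] for i in range(t)]


def weighted_summation_vt(x, t, k):
    """Weighted sum of the row sketches; the i-th row (1-based) has weight k**(i-1)."""
    rows = array_rows(x, t)
    # enumerate is 0-based here, so k**i matches the weight k**(i-1) of 1-based row i.
    return sum(k ** i * vt_sketch(row) for i, row in enumerate(rows))


if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 1, 0]
    print(array_rows(x, 2))
    print(weighted_summation_vt(x, t=2, k=3))
```

For `x = [1, 0, 1, 1, 0, 1, 0]` with `t = 2`, the assumed layout gives the rows `[1, 1, 0, 0]` and `[0, 1, 1, 0]`, whose VT sketches are 3 and 5, so with `k = 3` the weighted sum is 3 + 3 * 5 = 18. A code construction would typically constrain such a value modulo some integer; the modulus is left out here because the abstract does not specify it.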
About the journal:
The IEEE Transactions on Information Theory is a journal that publishes theoretical and experimental papers concerned with the transmission, processing, and utilization of information. The boundaries of acceptable subject matter are intentionally not sharply delimited. Rather, it is hoped that as the focus of research activity changes, a flexible policy will permit this Transactions to follow suit. Current appropriate topics are best reflected by recent Tables of Contents; they are summarized in the titles of editorial areas that appear on the inside front cover.