Pub Date: 2025-08-19; DOI: 10.1109/JSAIT.2025.3600363
Mats Gustafsson
The number of degrees of freedom (NDoF) in a communication channel fundamentally limits the number of independent spatial modes available for transmitting and receiving information. Although the NDoF can be computed numerically for specific configurations using singular value decomposition (SVD) of the channel operator, this approach provides limited physical insight. In this paper, we introduce a simple analytical estimate for the NDoF between arbitrarily shaped transmitter and receiver regions in free space. In the electrically large limit, where the NDoF is high, it is well approximated by the mutual shadow area, measured in units of wavelength squared. This area corresponds to the projected overlap of the regions, integrated over all lines of sight, and captures their effective spatial coupling. The proposed estimate generalizes and unifies several previously established results, including those based on Weyl’s law, shadow area, and the paraxial approximation. We analyze several example configurations to illustrate the accuracy of the estimate and validate it through comparisons with numerical SVD computations of the propagation channel. The results provide both practical tools and physical insight for the design and analysis of high-capacity communication and sensing systems.
Title: Shadow Area and Degrees of Freedom for Free-Space Communication. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 325-337.
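As a rough illustration of the paraxial special case that the abstract above says the shadow-area estimate generalizes, the commonly used paraxial rule of thumb for two parallel, facing apertures is NDoF ≈ A_T A_R / (λ d)^2 per polarization. The sketch below evaluates only that rule; it is not the paper's mutual-shadow-area formula, and the function name, argument names, and example dimensions are hypothetical.

```python
import math

def paraxial_ndof(area_tx_m2, area_rx_m2, distance_m, wavelength_m):
    """Paraxial estimate A_T * A_R / (lambda * d)^2 for a single polarization.

    Assumes two parallel, facing apertures whose separation is large
    compared with their size (an assumption, not taken from the abstract).
    """
    if min(area_tx_m2, area_rx_m2, distance_m, wavelength_m) <= 0:
        raise ValueError("all arguments must be positive")
    return (area_tx_m2 * area_rx_m2) / (wavelength_m * distance_m) ** 2

# Hypothetical example: two 1 m^2 apertures, 10 m apart, 1 cm wavelength.
estimate = paraxial_ndof(1.0, 1.0, 10.0, 0.01)
print(f"paraxial NDoF estimate: {estimate:.1f}")
```

Doubling the separation divides the estimate by four, which is the inverse-square dependence expected in the paraxial regime.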
Pub Date: 2025-08-18; DOI: 10.1109/JSAIT.2025.3599794
Ran Tamir;Nir Weinberger
We consider a molecular channel, in which messages are encoded in the frequencies of objects in a pool, and whose output during reading time is a noisy version of the input frequencies, as obtained by sampling with replacement from the pool. Motivated by recent DNA storage techniques, we focus on the regime in which the input resolution is unlimited. We propose two error probability bounds for this channel; the first bound is based on random coding analysis of the error probability of the maximum likelihood decoder and the second bound is derived by code expurgation techniques. We deduce an achievable bound on the capacity of this channel, and compare it both to the achievable bounds under limited input resolution and to a converse bound.
Title: Achievable Rates and Error Probability Bounds of Frequency-Based Channels of Unlimited Input Resolution. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 283-295.
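The reading step described in the abstract above (sampling with replacement from a pool whose composition encodes the message) can be sketched as a multinomial draw. This is a minimal simulation under assumptions: the function name `read_pool`, the three-type pool, and the read count are hypothetical, and no coding or decoding from the paper is reproduced.

```python
import random
from collections import Counter

def read_pool(frequencies, num_reads, rng):
    """Draw num_reads objects with replacement and return empirical frequencies.

    frequencies[i] is the (assumed) fraction of object type i in the pool.
    """
    types = list(range(len(frequencies)))
    draws = rng.choices(types, weights=frequencies, k=num_reads)
    counts = Counter(draws)
    return [counts[t] / num_reads for t in types]

rng = random.Random(0)           # fixed seed so the run is repeatable
x = [0.5, 0.3, 0.2]              # hypothetical input frequency vector
y = read_pool(x, 1000, rng)      # noisy version of x seen by the decoder
print(y)
```

With more reads the output frequencies concentrate around the input vector, which is why the number of reads acts as the noise parameter of this channel.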
Pub Date: 2025-08-14; DOI: 10.1109/JSAIT.2025.3598756
Brendon McBain;Emanuele Viterbo
This paper studies achievable rates of nanopore-based DNA storage when nanopore signals are decoded using a tractable channel model that does not rely on a basecalling algorithm. Specifically, the noisy nanopore channel (NNC) with the Scrappie pore model generates average output levels via i.i.d. geometric sample duplications corrupted by i.i.d. Gaussian noise (NNC-Scrappie). Simplified message passing algorithms are derived for efficient soft decoding of nanopore signals using NNC-Scrappie. Previously, evaluation of this channel model was limited by the lack of DNA storage datasets with nanopore signals included. This is solved by deriving an achievable rate based on the dynamic time-warping (DTW) algorithm that can be applied to genomic sequencing datasets subject to constraints that make the resulting rate applicable to DNA storage. Using a publicly available dataset from Oxford Nanopore Technologies (ONT), it is demonstrated that coding over multiple DNA strands of 100 bases in length and decoding with the NNC-Scrappie decoder can achieve rates of at least 0.64 to 1.18 bits per base, depending on the channel quality of the nanopore that is chosen in the sequencing device per channel-use, and 0.96 bits per base on average assuming uniformly chosen nanopores. These rates are pessimistic since they only apply to single reads and do not include calibration of the pore model to specific nanopores.
Title: Achievable Rates of Nanopore-Based DNA Storage. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 261-269.
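The channel structure named in the abstract above (i.i.d. geometric sample duplications corrupted by i.i.d. Gaussian noise) can be simulated in a few lines. The sketch below is an illustration under assumptions: the levels are passed in directly rather than looked up in the Scrappie pore model, durations are assumed geometric on {1, 2, ...}, and the names `simulate_nnc`, `p_stay`, and `sigma` are hypothetical.

```python
import random

def simulate_nnc(levels, p_stay, sigma, rng):
    """Duplicate each level a geometric number of times, then add Gaussian noise.

    Returns (samples, durations). durations[i] >= 1 is the number of noisy
    samples emitted for levels[i].
    """
    samples, durations = [], []
    for level in levels:
        count = 1
        # Keep emitting the same level while a coin with bias p_stay says "stay".
        while rng.random() < p_stay:
            count += 1
        durations.append(count)
        samples.extend(level + rng.gauss(0.0, sigma) for _ in range(count))
    return samples, durations

rng = random.Random(1)
levels = [0.0, 1.0, 2.0, 1.0]    # hypothetical mean current levels
samples, durations = simulate_nnc(levels, p_stay=0.5, sigma=0.1, rng=rng)
print(durations, len(samples))
```

The decoder sees only `samples`; the unknown `durations` are what makes this a duplication channel rather than a plain additive-noise channel.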
Pub Date: 2025-08-14; DOI: 10.1109/JSAIT.2025.3598773
V. Arvind Rameshwar;Nir Weinberger
In this paper, we consider a recent channel model of a nanopore sequencer proposed by McBain, Viterbo, and Saunderson (2024), termed the noisy nanopore channel (NNC). In essence, an NNC is a duplication channel with structured, Markov inputs, that is corrupted by memoryless noise. We first discuss a (tight) lower bound on the capacity of the NNC in the absence of random noise. Next, we present lower and upper bounds on the channel capacity of general noisy nanopore channels. We then consider two interesting regimes of operation of an NNC: first, where the memory of the input process is large and the random noise introduces erasures, and second, where the rate of measurements of the electric current (also called the sampling rate) is high. For these regimes, we show that it is possible to achieve information rates close to the noise-free capacity, using low-complexity encoding and decoding schemes. In particular, our decoder for the regime of high sampling rates makes use of a change-point detection procedure – a subroutine of immediate relevance for practitioners.
Title: On Achievable Rates Over Noisy Nanopore Channels. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 270-282.
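To give a feel for what a change-point detection subroutine does in the high-sampling-rate regime mentioned in the abstract above, here is a deliberately simple threshold detector. It is not the procedure from the paper: the jump-threshold rule, the function name `segment_means`, and the example signal are all assumptions made for illustration.

```python
def segment_means(samples, threshold):
    """Split samples wherever consecutive values jump by more than threshold.

    Returns the mean of each detected segment, i.e. an estimate of the
    de-duplicated level sequence.
    """
    if not samples:
        return []
    segments = [[samples[0]]]
    for prev, cur in zip(samples, samples[1:]):
        if abs(cur - prev) > threshold:
            segments.append([cur])        # a change point: start a new segment
        else:
            segments[-1].append(cur)      # same level: extend the current one
    return [sum(seg) / len(seg) for seg in segments]

# Hypothetical noisy, duplicated signal with levels near 1, 3, and 5.
signal = [1.0, 1.1, 0.9, 3.0, 3.1, 5.0]
print(segment_means(signal, threshold=1.0))
```

A detector of this kind cannot separate two consecutive inputs that map to the same level, so it only undoes duplications when adjacent levels differ by more than the noise.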
Pub Date: 2025-08-08; DOI: 10.1109/JSAIT.2025.3597013
Wentu Song;Kui Cai;Tony Q. S. Quek
The central problem in sequence reconstruction is to find the minimum number of distinct channel outputs required to uniquely reconstruct the transmitted sequence. According to Levenshtein’s work in 2001, this number is determined by the size of the maximum intersection between the error balls of any two distinct input sequences of the channel. In this work, we study the sequence reconstruction problem for the q-ary single-deletion single-substitution channel for any fixed integer $q\geq 2$. First, we prove that if two q-ary sequences of length n have a Hamming distance $d\geq 2$, then the intersection size of their error balls is upper bounded by $2qn-3q-2-\delta_{q,2}$, where $\delta_{i,j}$ is the Kronecker delta, and this bound is achievable. Next, we prove that if two q-ary sequences have a Hamming distance $d\geq 3$ and a Levenshtein distance $d_{\text{L}}\geq 2$, then the intersection size of their error balls is upper bounded by $3q+11$, and we show that the gap between this bound and the tight bound is at most 2.
Title: Sequence Reconstruction for the Single-Deletion Single-Substitution Channel. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 232-247.
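The quantity studied in the abstract above, the intersection size of two error balls, can be computed by brute force for very short sequences. The sketch below assumes the error ball of x is the set of sequences obtained by exactly one deletion followed by at most one substitution; that definition, and the names `ds_ball` and `max_intersection`, are assumptions rather than the paper's notation. The abstract does not state the range of n for which its bounds hold, so very short lengths need not match them.

```python
from itertools import product

def ds_ball(x, q):
    """Sequences reachable from tuple x by one deletion and at most one substitution."""
    ball = set()
    for i in range(len(x)):
        y = x[:i] + x[i + 1:]                 # delete position i
        ball.add(y)                           # zero substitutions
        for j in range(len(y)):
            for s in range(q):
                if s != y[j]:                 # substitute position j by symbol s
                    ball.add(y[:j] + (s,) + y[j + 1:])
    return ball

def max_intersection(n, q, min_hamming):
    """Largest ball intersection over pairs at Hamming distance >= min_hamming."""
    words = list(product(range(q), repeat=n))
    balls = {w: ds_ball(w, q) for w in words}
    best = 0
    for a in words:
        for b in words:
            if sum(u != v for u, v in zip(a, b)) >= min_hamming:
                best = max(best, len(balls[a] & balls[b]))
    return best

print(len(ds_ball((0, 0, 0, 0), 2)))
print(max_intersection(4, 2, 2))
```

By Levenshtein's framework, one more distinct output than the maximum intersection size is enough to tell any two admissible inputs apart, which is why bounding this intersection is the core of the problem.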
Pub Date: 2025-08-04; DOI: 10.1109/JSAIT.2025.3595457
Yaoyu Yang
In DNA sequencing, we often need to infer an unknown sequence from a collection of its corrupted copies. Each copy cannot faithfully tell the truth due to DNA fragmentation, point mutations, and measurement errors. The theoretical guarantee of unique reconstruction is thus of concern. This motivated the study of sequence reconstruction problems three decades ago. Recently, synthetic DNA has been regarded as an ultra-dense data storage medium. Sequence reconstruction is a crucial step in achieving reliable and efficient data readout. In this survey, we mainly summarize two types of problems, reconstruction from subsequences and reconstruction from substrings, in both combinatorial and probabilistic settings. We also discuss codes and algorithms that may assist with the future development of DNA-based data storage systems.
Title: Survey of Sequence Reconstruction Problems and Their Applications in DNA-Based Storage. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 352-366.
Pub Date: 2025-08-01; DOI: 10.1109/JSAIT.2025.3595005
Adir Kobovich;Nir Weinberger
Recent advancements in DNA storage show that composite DNA letters can significantly enhance storage capacity. We model this process as a multinomial channel and propose an optimization algorithm to determine its capacity-achieving input distribution (CAID) for an arbitrary number of output reads. Our empirical results match a scaling law under which the support size grows exponentially with capacity. In addition, we introduce a limited-support optimization algorithm that optimizes the input distribution under a restricted support size, making it more feasible for real-world DNA storage systems. We also extend our model to account for noise and study its effect on capacity and input design.
Title: Input Optimization in the Composite DNA Storage Channel. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 248-260.
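The abstract above does not specify its optimization algorithm, so the sketch below swaps in the classical Blahut-Arimoto iteration to show what searching for a capacity-achieving input distribution looks like. It makes several simplifying assumptions: a two-base composite letter (so the multinomial channel reduces to a binomial one), a fixed grid of candidate mixing ratios, and hypothetical names `binomial_channel`, `kl_rows`, and `blahut_arimoto`.

```python
import math

def binomial_channel(grid, n):
    """Row i holds P(k | p = grid[i]) for k = 0..n, with k ~ Binomial(n, p)."""
    return [
        [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
        for p in grid
    ]

def kl_rows(W, prior):
    """KL divergence (bits) between each channel row and the induced output law."""
    out = [sum(prior[x] * W[x][y] for x in range(len(W))) for y in range(len(W[0]))]
    return [
        sum(w * math.log2(w / out[y]) for y, w in enumerate(row) if w > 0)
        for row in W
    ]

def blahut_arimoto(W, iterations=200):
    """Return (mutual information in bits, input distribution) after the iterations."""
    prior = [1.0 / len(W)] * len(W)           # start from the uniform distribution
    for _ in range(iterations):
        D = kl_rows(W, prior)
        weights = [p * 2.0 ** d for p, d in zip(prior, D)]
        total = sum(weights)
        prior = [w / total for w in weights]  # multiplicative Blahut-Arimoto update
    D = kl_rows(W, prior)
    return sum(p * d for p, d in zip(prior, D)), prior

grid = [i / 10 for i in range(11)]            # candidate mixing ratios (assumed grid)
capacity, caid = blahut_arimoto(binomial_channel(grid, n=1))
print(round(capacity, 6), [round(p, 3) for p in caid])
```

With a single read the optimizer puts essentially all mass on the two pure letters, and increasing `n` lets intermediate ratios enter the support, consistent with the abstract's remark that support size grows with capacity.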
Pub Date: 2025-07-30; DOI: 10.1109/JSAIT.2025.3594310
Olai Å. Mostad;Eirik Rosnes;Hsuan-Yin Lin
In this work, we present a generalization of the recently proposed quantum Tanner codes by Leverrier and Zémor, which contains a construction of asymptotically good quantum low-density parity-check codes. Quantum Tanner codes have so far been constructed equivalently from groups, Cayley graphs, or square complexes constructed from groups. We show how to enlarge this to graphs with labeled local views and a family of square complexes, which is the largest possible in a certain sense. We show that the proposed generalization contains a family of asymptotically good quantum codes that are based on non-Cayley Schreier graphs, i.e., a new family of (generalized) quantum Tanner codes is provided. Moreover, we evaluate the performance of the generalized codes and compare with those based on Cayley graphs both in terms of minimum distance and logical error rate on the depolarizing channel, demonstrating that the proposed generalized codes based on Schreier graphs outperform those based on Cayley graphs.
Title: Asymptotically Good Generalized Quantum Tanner Codes. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 367-382.
Pub Date: 2025-07-28; DOI: 10.1109/JSAIT.2025.3593447
Wenkai Zhang;Zhiying Wang
Emerging DNA storage technologies use composite DNA letters, where information is represented by a probability vector, leading to higher information density and lower synthesis costs. However, this approach faces the problem of information leakage when DNA vessels are shared among untrusted vendors. This paper introduces an asymptotic ramp secret sharing scheme (ARSSS) for secret information storage using composite DNA letters. This scheme, inspired by secret sharing methods over finite fields and enhanced with a modified matrix-vector multiplication operation for probability vectors, achieves asymptotic information-theoretic data security for a large alphabet size. Moreover, this scheme reduces the number of reading operations for DNA samples compared to traditional schemes, and therefore lowers the complexity and the cost of DNA-based secret sharing. We further explore the construction of the scheme, starting with a proof of the existence of a suitable generator, followed by practical examples. Finally, we demonstrate efficient constructions to support large information sizes, which utilize multiple vessels for each secret share rather than a single vessel.
Title: Ramp Secret Sharing for Composite DNA. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 217-231.
Pub Date: 2025-07-21; DOI: 10.1109/JSAIT.2025.3590758
Yan Hao Ling;Nir Weinberger;Jonathan Scarlett
In this paper, we study error exponents for a class of DNA storage codes, based on index-based concatenated coding, in which the number of reads performed can be variable. That is, the decoder can sequentially perform reads and choose whether to output the final decision or take more reads, and we are interested in minimizing the average number of reads performed rather than a fixed pre-specified value. We show that this flexibility leads to a considerable reduction in the error probability compared to a fixed number of reads, not only in terms of constants in the error exponent but also in the scaling laws. This is shown via an achievability result for a suitably designed protocol, and in certain parameter regimes we additionally establish a matching converse that holds for all protocols within a broader class based on index-based concatenated coding.
Title: Error Exponents for DNA Storage Codes With a Variable Number of Reads. Journal: IEEE Journal on Selected Areas in Information Theory, vol. 6, pp. 205-216.