Construction of DNA codes with multiple constrained properties
Authors: Siddhartha Siddhiprada Bhoi, Udaya Parampalli, Abhay Kumar Singh
Journal: Cryptography and Communications
Publication date: 2024-05-14
DOI: 10.1007/s12095-024-00718-x (https://doi.org/10.1007/s12095-024-00718-x)
Citations: 0
Abstract
DNA sequences are prone to forming secondary structures: they fold back on themselves through non-specific hybridization of their nucleotides. Secondary structures with large stem lengths make the sequences chemically inactive during synthesis and sequencing. In DNA computing, other constraints, such as the homopolymer run length, introduce further complications. This paper addresses the problems caused by secondary structures in DNA sequences together with the requirement that homopolymer runs stay short. It presents families of DNA codes whose secondary structures have stem length at most two and whose homopolymer run length is at most four. We identified \(\mathbb{Z}_{11}\) as an ideal structure for constructing DNA codes that avoid the above problems. By mapping error-correcting codes over \(\mathbb{Z}_{11}\) to DNA nucleotides, we obtained DNA codes with rates 0.5765 times the corresponding code rate over \(\mathbb{Z}_{11}\), including some new secondary-structure-free and better-performing codes for DNA-based data storage and DNA computing.
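The two constraints named in the abstract can be illustrated with a small checker. This is only a sketch under stated assumptions, not the authors' construction: it treats a "stem of length k" as two non-overlapping length-k windows that are reverse complements of each other, and it ignores any minimum loop size between them. The function names (`max_homopolymer_run`, `has_stem`, `satisfies_constraints`, `rate_factor`) are hypothetical. The last function shows one reading that reproduces the 0.5765 figure, which the abstract does not itself spell out: each \(\mathbb{Z}_{11}\) symbol is assumed to map to three nucleotides, giving a factor of \(\log_2(11)/6\).

```python
from itertools import groupby
from math import log2

# Watson-Crick complement of each nucleotide.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def max_homopolymer_run(seq: str) -> int:
    """Length of the longest run of one repeated nucleotide."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def reverse_complement(seq: str) -> str:
    """Complement every base and reverse the order."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_stem(seq: str, stem_len: int) -> bool:
    """True if two non-overlapping windows of length stem_len are
    reverse complements (simplified model, assumption: no loop constraint)."""
    for i in range(len(seq) - 2 * stem_len + 1):
        target = reverse_complement(seq[i:i + stem_len])
        # Search only to the right of the first window to avoid overlap.
        if seq.find(target, i + stem_len) != -1:
            return True
    return False


def satisfies_constraints(seq: str, max_run: int = 4, max_stem: int = 2) -> bool:
    """Run length at most max_run and no stem longer than max_stem."""
    return max_homopolymer_run(seq) <= max_run and not has_stem(seq, max_stem + 1)


def rate_factor(alphabet_size: int = 11, nucleotides_per_symbol: int = 3) -> float:
    """Assumed rate scaling: bits per symbol divided by the bits carried
    by the nucleotides used per symbol (2 bits per nucleotide)."""
    return log2(alphabet_size) / (2 * nucleotides_per_symbol)


if __name__ == "__main__":
    print(satisfies_constraints("ACACACAC"))   # no long runs, no length-3 stem
    print(satisfies_constraints("AAACCCTTT"))  # AAA pairs with TTT
    print(rate_factor())                       # close to the quoted 0.5765
```

Under this simplified model a sequence such as `AAACCCTTT` is rejected because `AAA` and `TTT` can pair as a stem of length three, while `ACACACAC` passes both checks. A more faithful check would follow the exact stem definition used in the paper.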