Sequence analysis and secondary structure prediction of autosomal STR alleles using next generation sequencing (NGS) data
Hirak Ranjan Dash, Akash Ranga
Human Gene, volume 40, Article 201274 (Journal Article). Published 2024-03-04. DOI: 10.1016/j.humgen.2024.201274
https://www.sciencedirect.com/science/article/pii/S2773044124000184
Impact factor: 0.5. JCR: Q4 (Genetics & Heredity).
Citations: 0
Abstract
The introduction of NGS technology into forensic DNA analysis makes it possible to explore sequence-based alleles in STR markers. This allows in-depth analysis of observed STR allelic sequences to understand their nature, stability, inheritance, and the possible generation of artifacts during their analysis. In the current study, 100 allelic sequences at 20 STR markers generated using the NGS technique were analyzed for their sequence topographies and predicted secondary structures. The G + C content of the alleles observed at the 20 STR markers ranged from 7.62 ± 0.844% (D12ATA63) to 58.65 ± 1.367% (D2S1338). The average exact mass of one strand and of the opposite strand was highest in FGA (25,963.248 ± 1623.271; 27,501.720 ± 1712.691), whereas D4S2408 had the lowest exact mass of one strand and of the opposite strand (11,136.354 ± 1521.757; 11,582.021 ± 1585.486). As expected, none of the STR markers showed the presence of open reading frames, codons, or CRISPR sequences. Three STR markers, viz. D2S1338, TH01, and D5S2800, showed the presence of restriction sites for the Cac8I, TspGWI, TspDTI, AccI, and Hpy8I enzymes. Phylogenetic analysis revealed a close association between the alleles of the D12ATA63 and D19S433 markers. Stable pseudoknots were predicted in alleles of D2S1338, with an average energy of −0.76 and the highest average number of nucleotides in the pseudoknots (21.33), suggesting that this marker is more prone to generating amplification artifacts.
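The per-marker G + C figures reported above are a mean ± spread over the alleles of one marker. The following is a minimal sketch of how such a summary could be computed. It is not the authors' pipeline: the function names `gc_content` and `summarize_gc` and the toy allele strings are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say which spread statistic was reported.

```python
from statistics import mean, stdev


def gc_content(seq: str) -> float:
    """Return the G + C percentage of a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def summarize_gc(alleles):
    """Return (mean, sample SD) of G + C percentage over a marker's alleles.

    Assumption: sample standard deviation; the source does not specify.
    """
    values = [gc_content(s) for s in alleles]
    sd = stdev(values) if len(values) > 1 else 0.0
    return mean(values), sd


# Hypothetical toy alleles built from a tetranucleotide repeat;
# these are illustrative strings, not sequences from the study.
toy_alleles = ["AATG" * 6, "AATG" * 7, "AATG" * 9 + "ATG"]
avg, sd = summarize_gc(toy_alleles)
print(f"G + C content: {avg:.2f} ± {sd:.3f} %")
```

Applied to the real allele sequences of each marker, the same two functions would yield one mean ± SD pair per marker, which is the form of the values quoted in the abstract.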