Structural Features of the Nucleotide Sequences of Genomes

Journal of Computer Aided Chemistry (IF 0.1), vol. 10, pp. 38-52. Pub Date: 2009-01-01. DOI: 10.2751/JCAC.10.38

M. Takeda, M. Nakahara


Citations: 8

Abstract

We propose structural features of genomic DNA that are essential for generating and analyzing genomes. We calculated the frequency of occurrence of the nucleotides (bases) throughout the entire genome, treated as a polynucleotide molecule consisting of adenine (A), thymine (T), guanine (G) and cytosine (C) bases and including both coding and non-coding regions, primarily in the genomes of Saccharomyces cerevisiae, Escherichia coli, and Homo sapiens. Our results indicate that the base sequence of a single strand of DNA has the following characteristics: (1) reverse-complement symmetry of runs of 3-9 successive bases, (2) bias, and (3) multiple fractality in the distribution of the four bases A, T, G and C as a function of distance. In a double-logarithmic plot (power spectrum) of L (the distance from a base to the next base) versus P(L) (the probability of the base distribution at L), the distribution decreases exponentially at short distances and linearly at long distances. These structural features of a single strand of DNA can be clearly observed in any genomic DNA and are especially pronounced in eukaryotic genomes. In contrast, in artificial genomes or chromosomes with the same number of bases, the same base content and the same frequencies of the 64 triplets, the bias and the linearly decreasing fractality of the base distribution described above were missing, although the reverse-complement symmetry of the base sequences and the exponentially decreasing fractality of the base distribution were still observed.
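The two measurements named in the abstract, reverse-complement symmetry of short words and the distance distribution P(L), can be illustrated with a minimal Python sketch. This is not the authors' code. The function names are hypothetical, the symmetry score (mean relative difference between the count of a word and the count of its reverse complement) is one possible choice, and L is assumed here to mean the distance from a base to the next occurrence of the same base on the strand.

```python
from collections import Counter

# Watson-Crick complement table for an upper-case DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k on the single strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def symmetry_score(seq: str, k: int) -> float:
    """Mean relative difference between count(w) and count(revcomp(w)).

    0.0 means every word occurs exactly as often as its reverse
    complement; 1.0 means no reverse complement ever occurs.
    (Assumed metric, chosen for illustration only.)
    """
    counts = kmer_counts(seq, k)
    if not counts:
        return 0.0
    diffs = []
    for word, n in counts.items():
        m = counts.get(reverse_complement(word), 0)
        diffs.append(abs(n - m) / (n + m))
    return sum(diffs) / len(diffs)


def distance_distribution(seq: str, base: str) -> dict[int, float]:
    """P(L): fraction of occurrences of `base` whose next occurrence
    of the same base lies L positions further along the strand."""
    positions = [i for i, b in enumerate(seq) if b == base]
    gaps = Counter(b - a for a, b in zip(positions, positions[1:]))
    total = sum(gaps.values())
    if total == 0:
        return {}
    return {L: n / total for L, n in sorted(gaps.items())}
```

For a real genome, `symmetry_score` would be evaluated for k from 3 to 9, and `distance_distribution` would be plotted with both axes on a logarithmic scale to look for the exponential and linear regimes the abstract describes.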

Source journal

Journal of Computer Aided Chemistry (category: Chemistry, Multidisciplinary)

Self-citation rate

0.00%

Articles published