The Dimensionality of Genetic Information
Subhash Kak
Parallel Processing Letters, published 2023-10-25
DOI: https://doi.org/10.1142/s0129626423400121
Citation count: 0
Abstract
This paper investigates the dimensionality of genetic information from the perspective of optimal representation. It has recently been shown that the optimal coding of information corresponds to the noninteger dimension e, which is accompanied by the property of scale invariance. Since Nature is optimal, we should see this dimension reflected in the organization of the genetic code. Against this background, the paper investigates the logic behind the assignment of codons to amino acids, in which the number of codons per amino acid ranges from 1 to 6. It is shown that the non-uniformity of this assignment, which goes against mathematical coding theory that demands a near-uniform assignment, is consistent with noninteger dimensions. The codon assignment varies across amino acids because uniformity is a requirement for optimality only in a standard vector space, not in a noninteger-dimensional space. It is noteworthy that there are 20 different covering regions in an e-dimensional information space, which equals the number of amino acids. The problem of visualizing data that originates in an e-dimensional space but is examined in a 3-dimensional vector space is also discussed. It is shown that the assignment of codons to amino acids is fractal-like and is well modeled by the Zipf distribution, which is a power law. It is remarkable that the Zipf distribution, which holds for the letter frequencies of words in a natural language, also applies to the rank order of triplets in the code for amino acids.
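The codon counts mentioned in the abstract (1 to 6 codons per amino acid) can be tabulated directly from the standard genetic code. The sketch below is not taken from the paper, whose full text is not part of this record. It assumes the standard code (NCBI translation table 1), ranks the 20 amino acids by codon count, and estimates a log-log slope by ordinary least squares as a rough check on power-law behavior. The helper name `zipf_slope` is hypothetical, and a least-squares fit on 20 points is only a crude stand-in for whatever fitting procedure the author used.

```python
import math
from collections import Counter

# Standard genetic code (NCBI translation table 1).
# Codons are enumerated with bases in the order T, C, A, G,
# first position varying slowest and third position fastest.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codon_table = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Number of codons per amino acid, ignoring the three stop codons ('*').
degeneracy = Counter(aa for aa in codon_table.values() if aa != "*")

# Rank order: largest codon count first.
ranked = sorted(degeneracy.values(), reverse=True)


def zipf_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    Hypothetical helper for illustration: a slope near -1 would
    match the classic Zipf law, but this is only a rough check.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    for aa, n in degeneracy.most_common():
        print(aa, n)
    print("log-log slope:", round(zipf_slope(ranked), 3))
```

Running the script prints each amino acid (one-letter code) with its codon count, followed by the fitted slope. The many ties in the ranking (for example, nine amino acids with two codons each) mean the slope depends on how ties are ordered only through their shared count, so the estimate is stable but coarse.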
About the journal:
Parallel Processing Letters (PPL) aims to rapidly disseminate results in the field of parallel processing worldwide in the form of short papers. It fills the need for an information vehicle that can convey recent achievements and further the exchange of scientific information in the field. The journal has a wide scope; topics covered include:
- design and analysis of parallel and distributed algorithms
- theory of parallel computation
- parallel programming languages
- parallel programming environments
- parallel architectures and VLSI circuits