Non-quantized minimum free energy in untranslated region exons
K. Knapp, A. Rahaman, Y.-P.P. Chen
2007 IEEE International Conference on Bioinformatics and Biomedicine Workshops
DOI: 10.1109/BIBMW.2007.4425397 (https://doi.org/10.1109/BIBMW.2007.4425397)
Citations: 1
Abstract
To improve automated gene prediction in the untranslated region of a gene, we completed an in-depth analysis of the minimum free energy of 8,689 sub-genetic DNA sequences. We expanded Zhang's classification model and assigned each sub-genetic sequence to one of 27 possible motifs. We then calculated the minimum free energy for each motif to look for statistical features that correlate with biologically relevant sub-genetic sequences. If such sequences fall into distinct free energy quanta, it may be possible to characterize a motif by its minimum free energy. Proper characterization of motifs can lead to a better understanding of automated gene finding, gene variability, and the role DNA structure plays in gene network regulation. Our analysis determined (1) the average free energy for exons, introns, and other biologically relevant sub-genetic sequences; (2) that these subsequences do not fall into distinct energy quanta; (3) that introns do, however, occupy a tightly coupled average minimum free energy quantum compared with all other biologically relevant sub-genetic sequence types; (4) that single-exon genes show higher stability than exons that span the entire coding sequence as part of a multi-exon gene; and (5) that all motif types show a global free energy minimum at approximately nucleotide position 1,000 before reaching a plateau. These results should be relevant to biochemists and bioinformaticians seeking to understand the relationship between sub-genetic sequences and the information behind them.
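The "distinct quanta" question can be pictured with a small sketch. This is not the authors' pipeline. It assumes minimum free energy values have already been computed by some external folding tool and labeled with a motif class, and it simply checks whether the per-class value ranges overlap. The function names `summarize_by_motif` and `overlapping_pairs`, the three class labels, and all numbers are hypothetical and chosen for illustration only.

```python
from statistics import mean, stdev


def summarize_by_motif(records):
    """Group (motif, mfe) pairs and return {motif: (mean, stdev, min, max)}."""
    groups = {}
    for motif, mfe in records:
        groups.setdefault(motif, []).append(mfe)
    summary = {}
    for motif, values in groups.items():
        # stdev needs at least two points; report 0.0 for a single value.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[motif] = (mean(values), spread, min(values), max(values))
    return summary


def overlapping_pairs(summary):
    """Return motif pairs whose [min, max] free energy ranges overlap."""
    motifs = sorted(summary)
    pairs = []
    for i, a in enumerate(motifs):
        for b in motifs[i + 1:]:
            _, _, lo_a, hi_a = summary[a]
            _, _, lo_b, hi_b = summary[b]
            if lo_a <= hi_b and lo_b <= hi_a:
                pairs.append((a, b))
    return pairs


# Made-up values (kcal/mol) for illustration only; NOT data from the paper.
toy_records = [
    ("exon", -12.0), ("exon", -30.0),
    ("intron", -20.0), ("intron", -22.0),
    ("utr", -5.0), ("utr", -8.0),
]

summary = summarize_by_motif(toy_records)
print(overlapping_pairs(summary))
```

If every class sat in its own energy quantum, `overlapping_pairs` would return an empty list. Any overlapping pair means free energy by itself cannot separate those two classes. A real analysis would compare full distributions rather than raw minimum and maximum values.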