Finding Occurrences of Relevant Functional Elements in Genomic Signatures.
Edwin Jacox, Laura Elnitski
International journal of computational science, 2(5), 599-606. Published 2008-10-01. Journal Article.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2800375/pdf/nihms70363.pdf
Citation count: 0
Abstract
For genomic applications, signature-finding algorithms identify over-represented signatures (words) in collections of DNA sequences. The results can be presented as a specific sequence of bases, a consensus sequence showing the possible combinations of bases, or a matrix of weighted possibilities at each position. These results are often compared to a set of biological binding sites (i.e., known functional elements), which are usually represented as weight matrices. The comparison is made by scoring the signatures against each weight matrix to identify the best candidate for a positive hit. However, this approach can misclassify results when applied to short sequences, which signature finders frequently produce. We describe a novel method that uses a window around the original sequences (those on which the signature is based) to improve the comparison and yield a more significant measure of similarity. In doing so, our method transforms a list of DNA signatures into a resource of characterized binding sites with known functional roles and identifies novel elements in need of further elucidation.
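
The windowing idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the log-odds scoring scheme, the uniform background, the pseudocount, the flank size, and every name (`log_odds_matrix`, `best_window_score`, `windowed_score`) are assumptions made for illustration, and the 8-bp site and source sequence are invented toy data. The sketch shows why a 4-base signature cannot be scored on its own against a wider weight matrix, and how extending a window into the sequence the signature came from makes a full-width comparison possible.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds weights.

    Assumes a uniform background; the pseudocount avoids log(0).
    """
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def best_window_score(sequence, matrix):
    """Slide the matrix along the sequence; return (best score, offset).

    Returns (-inf, None) when the sequence is shorter than the matrix.
    Only A, C, G and T are handled in this sketch.
    """
    width = len(matrix)
    best_score, best_offset = float("-inf"), None
    for start in range(len(sequence) - width + 1):
        score = sum(matrix[i][sequence[start + i]] for i in range(width))
        if score > best_score:
            best_score, best_offset = score, start
    return best_score, best_offset


def windowed_score(source, hit_start, hit_len, matrix, flank):
    """Score the matrix inside a window of `flank` bases on each side
    of a signature occurrence in its original sequence."""
    lo = max(0, hit_start - flank)
    hi = min(len(source), hit_start + hit_len + flank)
    score, offset = best_window_score(source[lo:hi], matrix)
    return score, (None if offset is None else lo + offset)


# Toy data (hypothetical): an 8-bp site built from ten identical observations.
counts = [{b: (10 if b == c else 0) for b in BASES} for c in "TGACGTCA"]
matrix = log_odds_matrix(counts)

source = "GGCATGACGTCATTCG"  # the signature "ACGT" occurs at index 6

# Without flanks the 4-base signature is too short for the 8-column matrix.
print(windowed_score(source, 6, 4, matrix, flank=0))

# With 2 flanking bases per side the full site is recovered.
print(windowed_score(source, 6, 4, matrix, flank=2))
```

In practice the flank size would be chosen relative to the widths of the weight matrices being compared, and a score threshold or significance estimate would decide whether the best window counts as a hit; neither choice is specified here.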