A novel information contents based similarity metric for comparing TFBS motifs
Shaoqiang Zhang, Lifen Jiang, Chuanbin Du, Z. Su
2012 IEEE 6th International Conference on Systems Biology (ISB), 2012-09-27
DOI: 10.1109/ISB.2012.6314109 (https://doi.org/10.1109/ISB.2012.6314109)
Citations: 2
Abstract
Identifying the binding sites recognized by transcription factors (TFs) is one of the major challenges in deciphering the complex genetic regulatory networks encoded in a genome. A set of binding sites recognized by the same TF, called a motif, can be accurately represented by a position frequency matrix (PFM) or a position-specific scoring matrix (PSSM). Motifs often need to be compared, either when searching a motif database for motifs similar to a query motif or when clustering motifs that may be recognized by the same TF. In this paper, we design a novel metric, called SPIC (Similarity between Positions with Information Contents), that quantifies the similarity between two motifs using their PFMs, PSSMs, and column information contents. We demonstrate that this metric outperforms other state-of-the-art methods at clustering motifs of the same TF and differentiating motifs of different TFs.
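The abstract names three ingredients of SPIC (PFMs, PSSMs, and column information contents) but does not state how the metric combines them. The following is a minimal Python sketch of those ingredients only, not of SPIC itself. It assumes DNA sites of equal length, a uniform background of 0.25 per base, the common log-odds form of a PSSM, and information content measured as relative entropy in bits; the function names and the example sites are hypothetical.

```python
import math

BASES = "ACGT"
UNIFORM = {b: 0.25 for b in BASES}  # assumed background, not from the paper


def pfm_from_sites(sites, pseudocount=0.0):
    """Build a PFM: one dict of base frequencies per motif column."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pfm = []
    for i in range(length):
        counts = {b: pseudocount for b in BASES}
        for s in sites:
            counts[s[i]] += 1
        total = sum(counts.values())
        pfm.append({b: counts[b] / total for b in BASES})
    return pfm


def pssm_from_pfm(pfm, background=UNIFORM):
    """Log-odds PSSM (one common definition); zero frequencies map to -inf."""
    return [
        {b: math.log2(col[b] / background[b]) if col[b] > 0 else float("-inf")
         for b in BASES}
        for col in pfm
    ]


def column_information_content(column, background=UNIFORM):
    """Relative entropy (bits) of one PFM column against the background."""
    return sum(p * math.log2(p / background[b])
               for b, p in column.items() if p > 0)


# Hypothetical binding sites: three conserved columns, one fully variable.
sites = ["ACGT", "ACGA", "ACGC", "ACGG"]
pfm = pfm_from_sites(sites)
ic = [column_information_content(col) for col in pfm]
print(ic)  # [2.0, 2.0, 2.0, 0.0]
```

A fully conserved column reaches the maximum of 2 bits under a uniform background, while a column with equal base frequencies carries 0 bits, which is why information content is a natural way to weight positions when comparing motifs.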