{"title":"MethylSeqLogo:DNA 甲基化智能序列标识。","authors":"Fei-Man Hsu, Paul Horton","doi":"10.1186/s12859-024-05896-2","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Some transcription factors, MYC for example, bind sites of potentially methylated DNA. This may increase binding specificity as such sites are (1) highly under-represented in the genome, and (2) offer additional, tissue specific information in the form of hypo- or hyper-methylation. Fortunately, bisulfite sequencing data can be used to investigate this phenomenon.</p><p><strong>Method: </strong>We developed MethylSeqLogo, an extension of sequence logos which includes new elements to indicate DNA methylation and under-represented dimers in each position of a set binding sites. Our method displays information from both DNA strands, and takes into account the sequence context (CpG or other) and genome region (promoter versus whole genome) appropriate to properly assess the expected background dimer frequency and level of methylation. MethylSeqLogo preserves sequence logo semantics-the relative height of nucleotides within a column represents their proportion in the binding sites, while the absolute height of each column represents information (relative entropy) and the height of all columns added together represents total information RESULTS: We present figures illustrating the utility of using MethylSeqLogo to summarize data from several CpG binding transcription factors. The logos show that unmethylated CpG binding sites are a feature of transcription factors such as MYC and ZBTB33, while some other CpG binding transcription factors, such as CEBPB, appear methylation neutral.</p><p><strong>Conclusions: </strong>Our software enables users to explore bisulfite and ChIP sequencing data sets-and in the process obtain publication quality figures.</p>","PeriodicalId":8958,"journal":{"name":"BMC Bioinformatics","volume":null,"pages":null},"PeriodicalIF":2.9000,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11462690/pdf/","citationCount":"0","resultStr":"{\"title\":\"MethylSeqLogo: DNA methylation smart sequence logos.\",\"authors\":\"Fei-Man Hsu, Paul Horton\",\"doi\":\"10.1186/s12859-024-05896-2\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Some transcription factors, MYC for example, bind sites of potentially methylated DNA. This may increase binding specificity as such sites are (1) highly under-represented in the genome, and (2) offer additional, tissue specific information in the form of hypo- or hyper-methylation. Fortunately, bisulfite sequencing data can be used to investigate this phenomenon.</p><p><strong>Method: </strong>We developed MethylSeqLogo, an extension of sequence logos which includes new elements to indicate DNA methylation and under-represented dimers in each position of a set binding sites. Our method displays information from both DNA strands, and takes into account the sequence context (CpG or other) and genome region (promoter versus whole genome) appropriate to properly assess the expected background dimer frequency and level of methylation. MethylSeqLogo preserves sequence logo semantics-the relative height of nucleotides within a column represents their proportion in the binding sites, while the absolute height of each column represents information (relative entropy) and the height of all columns added together represents total information RESULTS: We present figures illustrating the utility of using MethylSeqLogo to summarize data from several CpG binding transcription factors. The logos show that unmethylated CpG binding sites are a feature of transcription factors such as MYC and ZBTB33, while some other CpG binding transcription factors, such as CEBPB, appear methylation neutral.</p><p><strong>Conclusions: </strong>Our software enables users to explore bisulfite and ChIP sequencing data sets-and in the process obtain publication quality figures.</p>\",\"PeriodicalId\":8958,\"journal\":{\"name\":\"BMC Bioinformatics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.9000,\"publicationDate\":\"2024-10-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11462690/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BMC Bioinformatics\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1186/s12859-024-05896-2\",\"RegionNum\":3,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s12859-024-05896-2","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
摘要
背景:一些转录因子(例如 MYC)与可能甲基化的 DNA 位点结合。这可能会增加结合的特异性,因为这些位点(1)在基因组中的代表性极低,(2)以低甲基化或高甲基化的形式提供额外的组织特异性信息。幸运的是,亚硫酸氢盐测序数据可用于研究这一现象:我们开发了 MethylSeqLogo,它是序列标识的一种扩展,其中包含了一些新元素,用于显示 DNA 甲基化和一组结合位点中每个位置上代表性不足的二聚体。我们的方法显示 DNA 双链的信息,并考虑到适当的序列上下文(CpG 或其他)和基因组区域(启动子或全基因组),以正确评估预期的背景二聚体频率和甲基化水平。MethylSeqLogo 保留了序列徽标的语义--一列中核苷酸的相对高度代表它们在结合位点中的比例,而每列的绝对高度代表信息(相对熵),所有列加起来的高度代表总信息 结果:我们展示的图表说明了使用 MethylSeqLogo 总结几个 CpG 结合转录因子数据的实用性。图标显示,未甲基化的 CpG 结合位点是 MYC 和 ZBTB33 等转录因子的特征,而其他一些 CpG 结合转录因子(如 CEBPB)则呈现甲基化中性:结论:我们的软件使用户能够探索亚硫酸氢盐和 ChIP 测序数据集,并在此过程中获得具有发表质量的数据。
MethylSeqLogo: DNA methylation smart sequence logos.
Background: Some transcription factors, MYC for example, bind sites of potentially methylated DNA. This may increase binding specificity as such sites are (1) highly under-represented in the genome, and (2) offer additional, tissue specific information in the form of hypo- or hyper-methylation. Fortunately, bisulfite sequencing data can be used to investigate this phenomenon.
Method: We developed MethylSeqLogo, an extension of sequence logos which includes new elements to indicate DNA methylation and under-represented dimers in each position of a set binding sites. Our method displays information from both DNA strands, and takes into account the sequence context (CpG or other) and genome region (promoter versus whole genome) appropriate to properly assess the expected background dimer frequency and level of methylation. MethylSeqLogo preserves sequence logo semantics-the relative height of nucleotides within a column represents their proportion in the binding sites, while the absolute height of each column represents information (relative entropy) and the height of all columns added together represents total information RESULTS: We present figures illustrating the utility of using MethylSeqLogo to summarize data from several CpG binding transcription factors. The logos show that unmethylated CpG binding sites are a feature of transcription factors such as MYC and ZBTB33, while some other CpG binding transcription factors, such as CEBPB, appear methylation neutral.
Conclusions: Our software enables users to explore bisulfite and ChIP sequencing data sets-and in the process obtain publication quality figures.
期刊介绍:
BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology.
BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.