从 ChIP-exo 数据中识别主题的加权两阶段序列比对框架

IF 6.7 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Patterns Pub Date : 2024-02-02 DOI:10.1016/j.patter.2024.100927

Yang Li, Yizhong Wang, Cankun Wang, Anjun Ma, Qin Ma, Bingqiang Liu

{"title":"从 ChIP-exo 数据中识别主题的加权两阶段序列比对框架","authors":"Yang Li, Yizhong Wang, Cankun Wang, Anjun Ma, Qin Ma, Bingqiang Liu","doi":"10.1016/j.patter.2024.100927","DOIUrl":null,"url":null,"abstract":"In this study, we introduce TESA (weighted two-stage alignment), an innovative motif prediction tool that refines the identification of DNA-binding protein motifs, essential for deciphering transcriptional regulatory mechanisms. Unlike traditional algorithms that rely solely on sequence data, TESA integrates the high-resolution chromatin immunoprecipitation (ChIP) signal, specifically from ChIP-exonuclease (ChIP-exo), by assigning weights to sequence positions, thereby enhancing motif discovery. TESA employs a nuanced approach combining a binomial distribution model with a graph model, further supported by a “bookend” model, to improve the accuracy of predicting motifs of varying lengths. Our evaluation, utilizing an extensive compilation of 90 prokaryotic ChIP-exo datasets from proChIPdb and 167 H. sapiens datasets, compared TESA’s performance against seven established tools. The results indicate TESA’s improved precision in motif identification, suggesting its valuable contribution to the field of genomic research.","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"41 1","pages":""},"PeriodicalIF":6.7000,"publicationDate":"2024-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A weighted two-stage sequence alignment framework to identify motifs from ChIP-exo data\",\"authors\":\"Yang Li, Yizhong Wang, Cankun Wang, Anjun Ma, Qin Ma, Bingqiang Liu\",\"doi\":\"10.1016/j.patter.2024.100927\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this study, we introduce TESA (weighted two-stage alignment), an innovative motif prediction tool that refines the identification of DNA-binding protein motifs, essential for deciphering transcriptional regulatory mechanisms. Unlike traditional algorithms that rely solely on sequence data, TESA integrates the high-resolution chromatin immunoprecipitation (ChIP) signal, specifically from ChIP-exonuclease (ChIP-exo), by assigning weights to sequence positions, thereby enhancing motif discovery. TESA employs a nuanced approach combining a binomial distribution model with a graph model, further supported by a “bookend” model, to improve the accuracy of predicting motifs of varying lengths. Our evaluation, utilizing an extensive compilation of 90 prokaryotic ChIP-exo datasets from proChIPdb and 167 H. sapiens datasets, compared TESA’s performance against seven established tools. The results indicate TESA’s improved precision in motif identification, suggesting its valuable contribution to the field of genomic research.\",\"PeriodicalId\":36242,\"journal\":{\"name\":\"Patterns\",\"volume\":\"41 1\",\"pages\":\"\"},\"PeriodicalIF\":6.7000,\"publicationDate\":\"2024-02-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Patterns\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1016/j.patter.2024.100927\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Patterns","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1016/j.patter.2024.100927","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

在这项研究中，我们介绍了 TESA（加权两阶段比对），这是一种创新的主题预测工具，它能完善 DNA 结合蛋白主题的识别，这对破译转录调控机制至关重要。与仅依赖序列数据的传统算法不同，TESA 通过为序列位置分配权重，整合了高分辨率染色质免疫沉淀（ChIP）信号，特别是来自 ChIP-exonuclease（ChIP-exo）的信号，从而提高了主题发现的能力。TESA 采用了一种细致入微的方法，将二项分布模型与图形模型相结合，并辅以 "书尾 "模型，从而提高了预测不同长度主题的准确性。我们利用来自 proChIPdb 的 90 个原核生物 ChIP-exo 数据集和 167 个智人数据集的广泛汇编进行了评估，将 TESA 的性能与七种成熟工具进行了比较。结果表明 TESA 提高了主题识别的精确度，这表明它在基因组研究领域做出了宝贵的贡献。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A weighted two-stage sequence alignment framework to identify motifs from ChIP-exo data

In this study, we introduce TESA (weighted two-stage alignment), an innovative motif prediction tool that refines the identification of DNA-binding protein motifs, essential for deciphering transcriptional regulatory mechanisms. Unlike traditional algorithms that rely solely on sequence data, TESA integrates the high-resolution chromatin immunoprecipitation (ChIP) signal, specifically from ChIP-exonuclease (ChIP-exo), by assigning weights to sequence positions, thereby enhancing motif discovery. TESA employs a nuanced approach combining a binomial distribution model with a graph model, further supported by a “bookend” model, to improve the accuracy of predicting motifs of varying lengths. Our evaluation, utilizing an extensive compilation of 90 prokaryotic ChIP-exo datasets from proChIPdb and 167 H. sapiens datasets, compared TESA’s performance against seven established tools. The results indicate TESA’s improved precision in motif identification, suggesting its valuable contribution to the field of genomic research.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊