Structure-based learning to predict and model protein-DNA interactions and transcription-factor co-operativity in cis-regulatory elements.

NAR Genomics and Bioinformatics (IF 4, JCR Q1, Genetics & Heredity). Pub Date: 2024-06-12. eCollection Date: 2024-06-01. DOI: 10.1093/nargab/lqae068
Fornes Oriol, Meseguer Alberto, Aguirre-Plans Joachim, Gohl Patrick, Bota Patricia M, Molina-Fernández Ruben, Bonet Jaume, Chinchilla-Hernandez Altair, Pegenaute Ferran, Gallego Oriol, Fernandez-Fuentes Narcis, Oliva Baldo
Volume 6, issue 2, article lqae068. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11167492/pdf/
Citations: 0

Abstract

Transcription factor (TF) binding is a key component of genomic regulation. Numerous high-throughput experimental methods exist to characterize TF-DNA binding specificities. Their application, however, is both laborious and expensive, which makes profiling all TFs challenging. For instance, the binding preferences of ∼25% of human TFs remain unknown; they have been neither determined experimentally nor inferred computationally. We introduce a structure-based learning approach to predict the binding preferences of TFs and to automate the modelling of TF regulatory complexes. We show the advantage of our approach over classical nearest-neighbor prediction at the limits of remote homology. Starting from a TF sequence or structure, we predict binding preferences in the form of motifs, which are then used to scan a DNA sequence for occurrences. The best matches are either profiled with a binding score or collected for subsequent modelling into a higher-order regulatory complex with DNA. Co-operativity is modelled by (i) the co-localization of TFs and (ii) the structural modelling of protein-protein interactions between TFs and with co-factors. We have applied our approach to automatically model the interferon-β enhanceosome and the pioneering complexes of OCT4, SOX2 (or SOX11) and KLF4 with a nucleosome, and we compare these models with the experimentally known structures.
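
The scanning and co-localization steps mentioned in the abstract can be illustrated with a generic position weight matrix (PWM) scan. The sketch below is not the authors' implementation: the count matrix, pseudocount, uniform background, score threshold, and the function names `counts_to_log_odds`, `scan`, and `colocalized` are all assumptions made for illustration, and the structure-based motif prediction itself is not reproduced here.

```python
import math

BASES = "ACGT"

# Hypothetical count matrix (rows = motif positions, columns = A, C, G, T).
# The numbers are invented for illustration and favour the site "ATGC".
TOY_COUNTS = [
    [8, 1, 1, 0],
    [0, 1, 0, 9],
    [1, 0, 9, 0],
    [0, 9, 0, 1],
]


def counts_to_log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2-odds scores."""
    pwm = []
    for row in counts:
        total = sum(row) + 4 * pseudocount
        pwm.append(
            [math.log2(((c + pseudocount) / total) / background) for c in row]
        )
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def colocalized(hits_a, hits_b, max_gap):
    """Pair start positions of two motifs' hits that lie within max_gap bp."""
    return [
        (a, b)
        for a, _ in hits_a
        for b, _ in hits_b
        if abs(a - b) <= max_gap
    ]


if __name__ == "__main__":
    pwm = counts_to_log_odds(TOY_COUNTS)
    for start, score in scan("CCATGCAA", pwm, threshold=4.0):
        print(start, round(score, 2))
```

Here the summed log-odds score stands in for a binding score, and `colocalized` treats two sites as co-localized when their start positions are within a fixed distance. A fuller treatment would also scan the reverse strand and use a background model fitted to the genome of interest.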

Source journal
CiteScore: 8.00
Self-citation rate: 2.20%
Articles published: 95
Review time: 15 weeks
Latest articles from this journal
Phenotype prediction in plants is improved by integrating large-scale transcriptomic datasets.
AntiBody Sequence Database.
Approximate nearest neighbor graph provides fast and efficient embedding with applications for large-scale biological data.
Cell- and tissue-specific glycosylation pathways informed by single-cell transcriptomics.
HiCrayon reveals distinct layers of multi-state 3D chromatin organization.