Klara Kuret, Aram Gustav Amalietti, D Marc Jones, Charlotte Capitanchik, Jernej Ule
Positional motif analysis reveals the extent of specificity of protein-RNA interactions observed by CLIP.
Genome Biology 23(1):191, published 2022-09-09. DOI: 10.1186/s13059-022-02755-2 ( https://doi.org/10.1186/s13059-022-02755-2 )
Abstract
Background: Crosslinking and immunoprecipitation (CLIP) is a method used to identify in vivo RNA-protein binding sites on a transcriptome-wide scale. With the increasing amounts of available data for RNA-binding proteins (RBPs), it is important to understand to what degree the enriched motifs specify the RNA-binding profiles of RBPs in cells.
Results: We develop positionally enriched k-mer analysis (PEKA), a computational tool for efficient analysis of enriched motifs from individual CLIP datasets, which minimizes the impact of technical and regional genomic biases by internal data normalization. We cross-validate PEKA with mCross and show that the use of input control for background correction is not required to yield high specificity of enriched motifs. We identify motif classes with common enrichment patterns across eCLIP datasets and across RNA regions, while also observing variations in the specificity and the extent of motif enrichment across eCLIP datasets, between variant CLIP protocols, and between CLIP and in vitro binding data. Thereby, we gain insights into the contributions of technical and regional genomic biases to the enriched motifs, and find how motif enrichment features relate to the domain composition and low-complexity regions of the studied proteins.
Conclusions: Our study provides insights into the overall contributions of regional binding preferences, protein domains, and low-complexity regions to the specificity of protein-RNA interactions, and shows the value of cross-motif and cross-RBP comparison for data interpretation. Our results are presented for exploratory analysis via an online platform in an RBP-centric and motif-centric manner ( https://imaps.goodwright.com/apps/peka/ ).
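The general idea of positional k-mer enrichment with internal normalization, as described in the Results, can be illustrated with a toy sketch. This is not the PEKA implementation: the function names, the window sizes, the pseudocount, and the simple frequency ratio used as a score are all assumptions made for illustration. The sketch counts k-mers that start close to crosslink sites and compares their frequency with k-mers that start in more distal flanks of the same sites, so the background comes from the same dataset rather than from an input control.

```python
from collections import Counter


def starts_in_range(seq_len, sites, k, lo, hi):
    """Return k-mer start positions whose distance to any site is in [lo, hi]."""
    starts = set()
    for site in sites:
        for offset in range(-hi, hi + 1):
            pos = site + offset
            if lo <= abs(offset) <= hi and 0 <= pos <= seq_len - k:
                starts.add(pos)
    return starts


def positional_enrichment(seq, sites, k=3, near=1, far=(5, 12), pseudo=1.0):
    """Toy score: k-mer frequency near sites divided by frequency in distal flanks.

    All parameters are hypothetical defaults chosen for the example only.
    """
    near_starts = starts_in_range(len(seq), sites, k, 0, near)
    # Distal positions that also lie near another site are excluded.
    far_starts = starts_in_range(len(seq), sites, k, far[0], far[1]) - near_starts

    fg = Counter(seq[p:p + k] for p in near_starts)
    bg = Counter(seq[p:p + k] for p in far_starts)

    kmers = set(fg) | set(bg)
    fg_total = sum(fg.values()) + pseudo * len(kmers)
    bg_total = sum(bg.values()) + pseudo * len(kmers)

    # Pseudocounts keep the ratio finite for k-mers absent from one window.
    return {
        kmer: ((fg[kmer] + pseudo) / fg_total) / ((bg[kmer] + pseudo) / bg_total)
        for kmer in kmers
    }


if __name__ == "__main__":
    # Synthetic sequence with a single UGU motif centred on a made-up site.
    sequence = "C" * 12 + "UGU" + "C" * 12
    scores = positional_enrichment(sequence, sites=[13])
    for kmer, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(kmer, round(score, 2))
```

In this toy setting, k-mers that cluster around the site score above 1 and k-mers that dominate the flanks score below 1. A real analysis would pool many sites across transcripts and treat RNA regions separately, which the sketch does not attempt.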
About the journal:
Genome Biology is a leading research journal that focuses on the study of biology and biomedicine from a genomic and post-genomic standpoint. The journal consistently publishes outstanding research across various areas within these fields.
With an impact factor of 12.3 (2022), Genome Biology is the 3rd highest-ranked research journal in the Genetics and Heredity category, according to Thomson Reuters. It is also ranked 2nd among research journals in the Biotechnology and Applied Microbiology category, where it is the top-ranking open access journal.
In summary, Genome Biology sets a high standard for scientific publications in the field, showcasing cutting-edge research and earning recognition among its peers.