Predicting cell population-specific gene expression from genomic sequence.

Frontiers in Bioinformatics; IF 2.8; JCR Q2 (Mathematical & Computational Biology); Pub Date: 2024-03-04; eCollection Date: 2024-01-01; DOI: 10.3389/fbinf.2024.1347276
Lieke Michielsen, Marcel J T Reinders, Ahmed Mahfouz
Volume 4, article 1347276. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10944912/pdf/

Abstract

Most regulatory elements, especially enhancer sequences, are cell population-specific. One could even argue that a distinct set of regulatory elements is what defines a cell population. However, discovering which non-coding regions of the DNA are essential in which context, and as a result, which genes are expressed, is a difficult task. Some computational models tackle this problem by predicting gene expression directly from the genomic sequence. These models are currently limited to predicting bulk measurements and mainly make tissue-specific predictions. Here, we present a model that leverages single-cell RNA-sequencing data to predict gene expression. We show that cell population-specific models outperform tissue-specific models, especially when the expression profile of a cell population and the corresponding tissue are dissimilar. Further, we show that our model can prioritize GWAS variants and learn motifs of transcription factor binding sites. We envision that our model can be useful for delineating cell population-specific regulatory elements.
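
The variant-prioritization idea in the abstract can be illustrated with a minimal sketch of in silico mutagenesis: score a sequence with the reference allele, score it again with the alternative allele, and take the difference per cell population. This is not the authors' model. The function names, the 4-bp window, and the weights are all hypothetical, and `toy_predict` is a hand-made linear scorer standing in for a trained sequence-to-expression network.

```python
# Hypothetical illustration only: names, weights, and window size are made up
# and do not come from the paper.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def toy_predict(encoded, weights):
    """Stand-in for a trained model: a position-wise linear score of the one-hot sequence."""
    return sum(x * w for row, wrow in zip(encoded, weights) for x, w in zip(row, wrow))


def variant_effect(ref_seq, pos, alt_base, weights):
    """In silico mutagenesis: score with the alternative allele minus score with the reference."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return toy_predict(one_hot(alt_seq), weights) - toy_predict(one_hot(ref_seq), weights)


# One made-up weight matrix per (hypothetical) cell population, 4 positions x 4 bases.
weights_by_population = {
    "population_a": [
        [0.0, 0.0, 0.0, 0.0],
        [0.5, -0.5, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ],
    "population_b": [[0.0] * 4 for _ in range(4)],
}

# Effect of an A>C substitution at position 1 of the window "TAGC" in each population.
effects = {
    name: variant_effect("TAGC", 1, "C", weights)
    for name, weights in weights_by_population.items()
}
print(effects)
```

In this toy setup the substitution lowers the score in `population_a` and leaves `population_b` unchanged, which is the kind of population-specific difference that could be used to rank candidate variants.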
