Comprehensive Analysis of Ubiquitously Expressed Genes in Humans from A Data-driven Perspective

IF 11.5 2区 生物学 Q1 GENETICS & HEREDITY Genomics, Proteomics & Bioinformatics Pub Date : 2023-02-01 DOI:10.1016/j.gpb.2021.08.017
Jianlei Gu , Jiawei Dai , Hui Lu , Hongyu Zhao
{"title":"Comprehensive Analysis of Ubiquitously Expressed Genes in Humans from A Data-driven Perspective","authors":"Jianlei Gu ,&nbsp;Jiawei Dai ,&nbsp;Hui Lu ,&nbsp;Hongyu Zhao","doi":"10.1016/j.gpb.2021.08.017","DOIUrl":null,"url":null,"abstract":"<div><p>Comprehensive characterization of spatial and temporal gene expression patterns in humans is critical for uncovering the regulatory codes of the human genome and understanding the molecular mechanisms of human diseases. Ubiquitously expressed genes (UEGs) refer to the genes expressed across a majority of, if not all, phenotypic and physiological conditions of an organism. It is known that many human genes are broadly expressed across tissues. However, most previous UEG studies have only focused on providing a list of UEGs without capturing their global expression patterns, thus limiting the potential use of UEG information. In this study, we proposed a novel data-driven framework to leverage the extensive collection of ∼ 40,000 human transcriptomes to derive a list of UEGs and their corresponding global expression patterns, which offers a valuable resource to further characterize human transcriptome. Our results suggest that about half (12,234; 49.01%) of the human genes are expressed in at least 80% of human transcriptomes, and the median size of the human transcriptome is 16,342 genes (65.44%). Through gene clustering, we identified a set of UEGs, named LoVarUEGs, which have stable expression across human transcriptomes and can be used as internal reference genes for expression measurement. To further demonstrate the usefulness of this resource, we evaluated the global expression patterns for 16 previously predicted <strong>disallowed genes</strong> in islet beta cells and found that seven of these genes showed relatively more varied expression patterns, suggesting that the repression of these genes may not be unique to islet beta cells.</p></div>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 1","pages":"Pages 164-176"},"PeriodicalIF":11.5000,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10373092/pdf/","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Genomics, Proteomics & Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1672022922000420","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
引用次数: 7

Abstract

Comprehensive characterization of spatial and temporal gene expression patterns in humans is critical for uncovering the regulatory codes of the human genome and understanding the molecular mechanisms of human diseases. Ubiquitously expressed genes (UEGs) refer to the genes expressed across a majority of, if not all, phenotypic and physiological conditions of an organism. It is known that many human genes are broadly expressed across tissues. However, most previous UEG studies have only focused on providing a list of UEGs without capturing their global expression patterns, thus limiting the potential use of UEG information. In this study, we proposed a novel data-driven framework to leverage the extensive collection of ∼ 40,000 human transcriptomes to derive a list of UEGs and their corresponding global expression patterns, which offers a valuable resource to further characterize human transcriptome. Our results suggest that about half (12,234; 49.01%) of the human genes are expressed in at least 80% of human transcriptomes, and the median size of the human transcriptome is 16,342 genes (65.44%). Through gene clustering, we identified a set of UEGs, named LoVarUEGs, which have stable expression across human transcriptomes and can be used as internal reference genes for expression measurement. To further demonstrate the usefulness of this resource, we evaluated the global expression patterns for 16 previously predicted disallowed genes in islet beta cells and found that seven of these genes showed relatively more varied expression patterns, suggesting that the repression of these genes may not be unique to islet beta cells.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于数据驱动的人类普遍表达基因综合分析
全面表征人类的时空基因表达模式对于揭示人类基因组的调控密码和理解人类疾病的分子机制至关重要。普遍表达基因(UEG)是指在生物体的大多数(如果不是全部的话)表型和生理条件下表达的基因。众所周知,许多人类基因在组织中广泛表达。然而,大多数先前的UEG研究只关注于提供UEG的列表,而没有捕捉它们的全局表达模式,从而限制了UEG信息的潜在使用。在这项研究中,我们提出了一个新的数据驱动框架,利用广泛收集的约40000个人类转录组来推导UEG及其相应的全球表达模式列表,这为进一步表征人类转录组提供了宝贵的资源。我们的结果表明,大约一半(12234;49.01%)的人类基因在至少80%的人类转录组中表达,人类转录组的中位大小为16342个基因(65.44%)。通过基因聚类,我们鉴定了一组UEG,命名为LoVarUEGs,其在人类转录组中具有稳定的表达,并且可以用作表达测量的内部参考基因。为了进一步证明这一资源的有用性,我们评估了16个先前预测的胰岛β细胞中不被允许的基因的整体表达模式,发现其中7个基因表现出相对更多样的表达模式,这表明这些基因的抑制可能不是胰岛β细胞独有的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Genomics, Proteomics & Bioinformatics
Genomics, Proteomics & Bioinformatics Biochemistry, Genetics and Molecular Biology-Biochemistry
CiteScore
14.30
自引率
4.20%
发文量
844
审稿时长
61 days
期刊介绍: Genomics, Proteomics and Bioinformatics (GPB) is the official journal of the Beijing Institute of Genomics, Chinese Academy of Sciences / China National Center for Bioinformation and Genetics Society of China. It aims to disseminate new developments in the field of omics and bioinformatics, publish high-quality discoveries quickly, and promote open access and online publication. GPB welcomes submissions in all areas of life science, biology, and biomedicine, with a focus on large data acquisition, analysis, and curation. Manuscripts covering omics and related bioinformatics topics are particularly encouraged. GPB is indexed/abstracted by PubMed/MEDLINE, PubMed Central, Scopus, BIOSIS Previews, Chemical Abstracts, CSCD, among others.
期刊最新文献
Review and Evaluate the Bioinformatics Analysis Strategies of ATAC-seq and CUT&Tag Data. Identification of highly repetitive barley enhancers with long-range regulation potential via STARR-seq CpG island definition and methylation mapping of the T2T-YAO genome Pindel-TD: a tandem duplication detector based on a pattern growth approach SMARTdb: An Integrated Database for Exploring Single-cell Multi-omics Data of Reproductive Medicine
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1