Lu Tang, Dongyang Xu, Lingcong Luo, Weiyan Ma, Xiaojie He, Yong Diao, Rongqin Ke, Philipp Kapranov
BMC Biology, volume 22, issue 1, article 273. Published 2024-11-26. DOI: https://doi.org/10.1186/s12915-024-02069-8. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590353/pdf/
Citations: 0
Abstract
A novel human protein-coding locus identified using a targeted RNA enrichment technique.
Background: Accurate and comprehensive genomic annotation, including the full list of protein-coding genes, is vital for understanding the molecular mechanisms of human biology. We have previously shown that the genome contains a multitude of yet hidden functional exons and transcripts, some of which might represent novel mRNAs. These results resonate with those from other groups and strongly argue that two decades after the completion of the first draft of the human genome sequence, the current annotation of human genes and transcripts remains far from being complete.
Results: Using a targeted RNA enrichment technique, we showed that one of the novel functional exons previously discovered by us, currently annotated as part of a long non-coding RNA, is actually part of a novel protein-coding gene, InSETG-4, which encodes a novel human protein with no known homologs or motifs. We found that InSETG-4 is induced by various DNA-damaging agents across multiple cell types and therefore might represent a novel component of the DNA damage response. Despite its low abundance in bulk cell populations, InSETG-4 exhibited expression restricted to a small fraction of cells, as demonstrated by amplification-based single-molecule fluorescence in situ hybridization (asmFISH) analysis.
Conclusions: This study argues that yet undiscovered human protein-coding genes exist and provides an example of how targeted RNA enrichment techniques can help to fill this major gap in our knowledge of the information encoded in the human genome.
About the journal:
BMC Biology is a broad scope journal covering all areas of biology. Our content includes research articles, new methods and tools. BMC Biology also publishes reviews, Q&A, and commentaries.