Long-read proteogenomics to connect disease-associated sQTLs to the protein isoform effectors of disease.

IF 8.1 | Zone 1, Biology | Q1, Genetics & Heredity | American Journal of Human Genetics | Pub Date: 2024-09-05 | Epub Date: 2024-07-29 | DOI: 10.1016/j.ajhg.2024.07.003
Abdullah Abood, Larry D Mesner, Erin D Jeffery, Mayank Murali, Micah D Lehe, Jamie Saquing, Charles R Farber, Gloria M Sheynkman
Pages: 1914-1931. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393689/pdf/

Abstract

A major fraction of loci identified by genome-wide association studies (GWASs) mediate alternative splicing, but mechanistic interpretation is hindered by the technical limitations of short-read RNA sequencing (RNA-seq), which cannot directly link splicing events to full-length protein isoforms. Long-read RNA-seq is a powerful tool for characterizing transcript isoforms and, more recently, for inferring the existence of protein isoforms. Here, we present an approach that integrates information from GWASs, splicing quantitative trait loci (sQTLs), and PacBio long-read RNA-seq in a disease-relevant model to infer the effects of sQTLs on the protein isoforms that are ultimately produced. We demonstrate the utility of our approach using bone mineral density (BMD) GWAS data. We identified 1,863 sQTLs from the Genotype-Tissue Expression (GTEx) project in 732 protein-coding genes that colocalized with BMD associations (H4PP ≥ 0.75). We generated PacBio Iso-Seq data (approximately 22 million full-length reads) on human osteoblasts, identifying 68,326 protein-coding isoforms, of which 17,375 (25%) were unannotated. By casting the sQTLs onto protein isoforms, we connected 809 sQTLs to 2,029 protein isoforms from 441 genes expressed in osteoblasts. Overall, we found that 74 sQTLs influenced isoforms likely impacted by nonsense-mediated decay and that 190 potentially resulted in the expression of unannotated protein isoforms. Finally, we functionally validated colocalizing sQTLs in TPM2: siRNA-mediated knockdown in osteoblasts showed that two TPM2 isoforms had opposing effects on mineralization, whereas knockdown of the entire gene had no effect. Our approach should generalize across diverse clinical traits and provide insights into protein isoform activities modulated by GWAS loci.
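The two filtering steps named in the abstract (keeping sQTLs with H4PP ≥ 0.75, then casting them onto long-read isoforms) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes, purely for illustration, that each sQTL is tied to one intron junction and that an sQTL is linked to an isoform when that exact junction appears in the isoform's structure. The record types `SQTL` and `Isoform`, the function names, and the toy gene and isoform identifiers are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

H4PP_THRESHOLD = 0.75  # colocalization cutoff quoted in the abstract

Junction = tuple  # (chrom, intron_start, intron_end); coordinate convention is an assumption


@dataclass(frozen=True)
class SQTL:
    """Hypothetical record: one sQTL and the intron junction it is associated with."""
    sqtl_id: str
    gene: str
    junction: Junction
    h4pp: float  # colocalization posterior probability with the GWAS signal


@dataclass(frozen=True)
class Isoform:
    """Hypothetical record: one long-read protein-coding isoform."""
    isoform_id: str
    gene: str
    junctions: frozenset  # set of Junction tuples observed in this isoform
    annotated: bool


def colocalized(sqtls, threshold=H4PP_THRESHOLD):
    """Keep sQTLs whose H4PP meets the threshold."""
    return [s for s in sqtls if s.h4pp >= threshold]


def cast_onto_isoforms(sqtls, isoforms):
    """Map each sQTL to the isoforms of the same gene that contain its junction.

    Exact junction matching is an assumed linking rule, not one stated in the abstract.
    sQTLs with no matching isoform are omitted from the result.
    """
    by_gene = {}
    for iso in isoforms:
        by_gene.setdefault(iso.gene, []).append(iso)

    links = {}
    for s in sqtls:
        hits = [iso.isoform_id for iso in by_gene.get(s.gene, [])
                if s.junction in iso.junctions]
        if hits:
            links[s.sqtl_id] = hits
    return links


# Toy data (entirely made up) to show the flow.
toy_sqtls = [
    SQTL("sqtl_1", "GENE_A", ("chr1", 100, 200), 0.91),
    SQTL("sqtl_2", "GENE_A", ("chr1", 300, 400), 0.40),  # fails the H4PP cutoff
    SQTL("sqtl_3", "GENE_B", ("chr2", 50, 80), 0.80),    # passes, but no isoform has its junction
]
toy_isoforms = [
    Isoform("A.1", "GENE_A", frozenset({("chr1", 100, 200), ("chr1", 300, 400)}), True),
    Isoform("A.2", "GENE_A", frozenset({("chr1", 100, 250)}), False),
    Isoform("B.1", "GENE_B", frozenset({("chr2", 10, 40)}), True),
]

kept = colocalized(toy_sqtls)
links = cast_onto_isoforms(kept, toy_isoforms)
print(links)
```

In this toy run only `sqtl_1` ends up linked to an isoform, which mirrors how the abstract's 1,863 colocalizing sQTLs narrow to the 809 that could be connected to isoforms expressed in osteoblasts. The actual linking criteria in the paper may differ.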
