Long-read proteogenomics to connect disease-associated sQTLs to the protein isoform effectors of disease
Abdullah Abood, Larry D Mesner, Erin D Jeffery, Mayank Murali, Micah D Lehe, Jamie Saquing, Charles R Farber, Gloria M Sheynkman
American Journal of Human Genetics, pp. 1914-1931. Published 2024-09-05 (Epub 2024-07-29). DOI: https://doi.org/10.1016/j.ajhg.2024.07.003. PMC: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393689/pdf/
Abstract
A major fraction of loci identified by genome-wide association studies (GWASs) mediate alternative splicing, but mechanistic interpretation is hindered by the technical limitations of short-read RNA sequencing (RNA-seq), which cannot directly link splicing events to full-length protein isoforms. Long-read RNA-seq is a powerful tool for characterizing transcript isoforms and, more recently, for inferring the existence of protein isoforms. Here, we present an approach that integrates information from GWASs, splicing quantitative trait loci (sQTLs), and PacBio long-read RNA-seq in a disease-relevant model to infer the effects of sQTLs on the ultimate protein isoform products they encode. We demonstrate the utility of our approach using bone mineral density (BMD) GWAS data. We identified 1,863 sQTLs from the Genotype-Tissue Expression (GTEx) project in 732 protein-coding genes that colocalized with BMD associations (H4PP ≥ 0.75). We generated PacBio Iso-Seq data (N = ∼22 million full-length reads) on human osteoblasts, identifying 68,326 protein-coding isoforms, of which 17,375 (25%) were unannotated. By casting the sQTLs onto protein isoforms, we connected 809 sQTLs to 2,029 protein isoforms from 441 genes expressed in osteoblasts. Overall, we found 74 sQTLs that influenced isoforms likely impacted by nonsense-mediated decay and 190 that potentially resulted in the expression of unannotated protein isoforms. Finally, we functionally validated colocalizing sQTLs in TPM2: siRNA-mediated knockdown in osteoblasts showed that two TPM2 isoforms had opposing effects on mineralization, whereas knockdown of the entire gene had no effect. Our approach should be generalizable across diverse clinical traits and provide insights into protein isoform activities modulated by GWAS loci.
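As a rough illustration of the colocalization cutoff mentioned in the abstract (H4PP ≥ 0.75), the sketch below filters a toy list of sQTL colocalization results and counts the retained sQTLs and genes. It is not the authors' pipeline: the `ColocResult` record, its field names, and the toy identifiers are hypothetical. Only the 0.75 threshold and the isoform counts in the final line come from the abstract.

```python
from dataclasses import dataclass

# Cutoff stated in the abstract; everything else here is illustrative.
H4PP_THRESHOLD = 0.75


@dataclass(frozen=True)
class ColocResult:
    """Hypothetical record for one sQTL/GWAS colocalization test."""
    sqtl_id: str   # placeholder identifier, not a real sQTL
    gene: str      # placeholder gene name
    h4pp: float    # posterior probability of a shared causal variant (H4)


def colocalized(results, threshold=H4PP_THRESHOLD):
    """Keep results whose H4 posterior probability meets the cutoff."""
    return [r for r in results if r.h4pp >= threshold]


def summarize(results):
    """Return (number of retained sQTLs, number of distinct genes)."""
    kept = colocalized(results)
    return len(kept), len({r.gene for r in kept})


# Toy input: made-up values, purely to exercise the filter.
toy = [
    ColocResult("sqtl_a", "GENE1", 0.91),
    ColocResult("sqtl_b", "GENE1", 0.75),  # exactly at the cutoff, retained
    ColocResult("sqtl_c", "GENE2", 0.40),  # below the cutoff, dropped
    ColocResult("sqtl_d", "GENE3", 0.80),
]
n_sqtls, n_genes = summarize(toy)

# Arithmetic check on the reported share of unannotated isoforms.
unannotated_fraction = 17_375 / 68_326
```

The same pattern (filter per-sQTL results, then collapse to distinct genes) is how counts such as "1,863 sQTLs in 732 genes" can arise from one result table, and the final line confirms that 17,375 of 68,326 rounds to the reported 25%.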
About the journal:
The American Journal of Human Genetics (AJHG) is a monthly journal published by Cell Press, chosen by The American Society of Human Genetics (ASHG) as its premier publication starting from January 2008. AJHG represents Cell Press's first society-owned journal, and both ASHG and Cell Press anticipate significant synergies between AJHG content and that of other Cell Press titles.