Combining full-length gene assay and SpliceAI to interpret the splicing impact of all possible SPINK1 coding variants

IF 3.8; Zone 3 (Medicine); Q2 GENETICS & HEREDITY; Human Genomics; Pub Date: 2024-02-27; DOI: 10.1186/s40246-024-00586-9
Hao Wu, Jin-Huan Lin, Xin-Ying Tang, Gaëlle Marenne, Wen-Bin Zou, Sacha Schutz, Emmanuelle Masson, Emmanuelle Génin, Yann Fichou, Gerald Le Gac, Claude Férec, Zhuan Liao, Jian-Min Chen

Abstract

Single-nucleotide variants (SNVs) within gene coding sequences can significantly impact pre-mRNA splicing, bearing profound implications for pathogenic mechanisms and precision medicine. In this study, we aim to harness the well-established full-length gene splicing assay (FLGSA) in conjunction with SpliceAI to prospectively interpret the splicing effects of all potential coding SNVs within the four-exon SPINK1 gene, a gene associated with chronic pancreatitis. Our study began with a retrospective analysis of 27 SPINK1 coding SNVs previously assessed using FLGSA, proceeded with a prospective analysis of 35 new FLGSA-tested SPINK1 coding SNVs, followed by data extrapolation, and ended with further validation. In total, we analyzed 67 SPINK1 coding SNVs, which account for 9.3% of the 720 possible coding SNVs. Among these 67 FLGSA-analyzed SNVs, 12 were found to impact splicing. Through detailed comparison of FLGSA results and SpliceAI predictions, we inferred that the remaining 653 untested coding SNVs in the SPINK1 gene are unlikely to significantly affect splicing. Of the 12 splice-altering events, nine produced both normally spliced and aberrantly spliced transcripts, while the remaining three only generated aberrantly spliced transcripts. These splice-impacting SNVs were found solely in exons 1 and 2, notably at the first and/or last coding nucleotides of these exons. Among the 12 splice-altering events, 11 were missense variants (2.17% of 506 potential missense variants), and one was synonymous (0.61% of 164 potential synonymous variants). Notably, adjusting the SpliceAI cut-off to 0.30 instead of the conventional 0.20 would improve specificity without reducing sensitivity. By integrating FLGSA with SpliceAI, we have determined that less than 2% (1.67%) of all possible coding SNVs in SPINK1 significantly influence splicing outcomes. 
Our findings emphasize the critical importance of conducting splicing analysis within the broader genomic sequence context of the study gene and highlight the inherent uncertainties associated with intermediate SpliceAI scores (0.20 to 0.80). This study contributes to the field by being the first to prospectively interpret all potential coding SNVs in a disease-associated gene with a high degree of accuracy, representing a meaningful attempt at shifting from retrospective to prospective variant analysis in the era of exome and genome sequencing.
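The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: the constant names and the helpers `pct`, `score_band`, and `flagged` are hypothetical, treating the 0.20 and 0.80 bounds as inclusive is an assumption, and the example score of 0.25 is made up to show how the two cut-offs can disagree.

```python
# Counts as stated in the abstract.
TOTAL_CODING_SNVS = 720
FLGSA_TESTED = 67
SPLICE_ALTERING = 12
MISSENSE = (11, 506)    # (splice-altering, possible missense SNVs)
SYNONYMOUS = (1, 164)   # (splice-altering, possible synonymous SNVs)


def pct(part: int, whole: int, digits: int = 2) -> float:
    """Return part/whole as a percentage rounded to `digits` places."""
    return round(100 * part / whole, digits)


def score_band(score: float, low: float = 0.20, high: float = 0.80) -> str:
    """Bucket a SpliceAI score; inclusive bounds are an assumption."""
    if score < low:
        return "low"
    if score <= high:
        return "intermediate"
    return "high"


def flagged(score: float, cutoff: float) -> bool:
    """Hypothetical rule: a variant is flagged if its score meets the cut-off."""
    return score >= cutoff


untested = TOTAL_CODING_SNVS - FLGSA_TESTED
print(untested)                                   # 653
print(pct(FLGSA_TESTED, TOTAL_CODING_SNVS, 1))    # 9.3
print(pct(SPLICE_ALTERING, TOTAL_CODING_SNVS))    # 1.67
print(pct(*MISSENSE))                             # 2.17
print(pct(*SYNONYMOUS))                           # 0.61

example_score = 0.25  # made-up value, not taken from the study
print(score_band(example_score))                  # intermediate
print(flagged(example_score, 0.20))               # True
print(flagged(example_score, 0.30))               # False
```

A score such as 0.25 falls in the intermediate band and is flagged at the conventional 0.20 cut-off but not at 0.30, which is the kind of case where a stricter cut-off changes the call.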