Template-specific optimization of NGS genotyping pipelines reveals allele-specific variation in MHC gene expression

IF 5.5 | Zone 1, Biology | JCR Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY | Molecular Ecology Resources | Pub Date: 2024-02-08 | DOI: 10.1111/1755-0998.13935
Artemis Efstratiou, Arnaud Gaigher, Sven Künzel, Ana Teles, Tobias L. Lenz
Citations: 0

Abstract


Template-specific optimization of NGS genotyping pipelines reveals allele-specific variation in MHC gene expression

Using high-throughput sequencing for precise genotyping of multi-locus gene families, such as the major histocompatibility complex (MHC), remains challenging, due to the complexity of the data and difficulties in distinguishing genuine from erroneous variants. Several dedicated genotyping pipelines for data from high-throughput sequencing, such as next-generation sequencing (NGS), have been developed to tackle the ensuing risk of artificially inflated diversity. Here, we thoroughly assess three such multi-locus genotyping pipelines for NGS data, the DOC method, AmpliSAS and ACACIA, using MHC class IIβ data sets of three-spined stickleback gDNA, cDNA and “artificial” plasmid samples with known allelic diversity. We show that genotyping of gDNA and plasmid samples at optimal pipeline parameters was highly accurate and reproducible across methods. However, for cDNA data, the gDNA-optimal parameter configuration yielded decreased overall genotyping precision and consistency between pipelines. Further adjustments of key clustering parameters were required to account for higher error rates and larger variation in sequencing depth per allele, highlighting the importance of template-specific pipeline optimization for reliable genotyping of multi-locus gene families. Through accurate paired gDNA-cDNA typing and MHC-II haplotype inference, we show that MHC-II allele-specific expression levels correlate negatively with allele number across haplotypes. Lastly, sibship-assisted cDNA-typing of MHC-I revealed novel variants linked in haplotype blocks, and a higher-than-previously-reported individual MHC-I allelic diversity. In conclusion, we provide novel genotyping protocols for the three-spined stickleback MHC-I and -II genes, and evaluate the performance of popular NGS-genotyping pipelines. We also show that fine-tuned genotyping of paired gDNA-cDNA samples facilitates amplification bias-corrected MHC allele expression analysis.
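
The abstract does not spell out how allele expression is corrected for amplification bias. One common way to think about it, shown here only as a minimal sketch and not as the authors' method, is to divide each allele's share of cDNA reads by its share of gDNA reads from the same individual, so that alleles that simply amplify better do not appear more highly expressed. The function names and the example read counts below are hypothetical.

```python
from typing import Dict


def read_fractions(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-allele read counts into per-allele read proportions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {allele: n / total for allele, n in counts.items()}


def bias_corrected_expression(
    gdna_counts: Dict[str, int], cdna_counts: Dict[str, int]
) -> Dict[str, float]:
    """Illustrative correction (an assumption, not the published protocol):
    cDNA read proportion divided by gDNA read proportion, per allele.

    Only alleles genotyped in gDNA with at least one read are scored;
    an allele absent from the cDNA sample gets a value of 0.0.
    """
    gdna = read_fractions(gdna_counts)
    cdna = read_fractions(cdna_counts)
    corrected = {}
    for allele, gdna_share in gdna.items():
        if gdna_share == 0:
            continue  # cannot normalise against an allele with no gDNA reads
        corrected[allele] = cdna.get(allele, 0.0) / gdna_share
    return corrected


if __name__ == "__main__":
    # Hypothetical read counts for one individual carrying three alleles.
    gdna = {"allele_A": 500, "allele_B": 250, "allele_C": 250}
    cdna = {"allele_A": 600, "allele_B": 300, "allele_C": 100}
    for allele, value in bias_corrected_expression(gdna, cdna).items():
        print(f"{allele}: {value:.2f}")
```

A value above 1 means the allele is over-represented in cDNA relative to gDNA, and a value below 1 means it is under-represented. Real data would also need the template-specific filtering described in the abstract before any ratio is computed.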

Source journal
Molecular Ecology Resources (Biology: Evolutionary Biology)
CiteScore: 15.60
Self-citation rate: 5.20%
Articles published: 170
Review time: 3 months
About the journal: Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines. In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.
Latest articles from the journal
Development of SNP Panels from Low-Coverage Whole Genome Sequencing (lcWGS) to Support Indigenous Fisheries for Three Salmonid Species in Northern Canada.
Probe Capture Enrichment Sequencing of amoA Genes Improves the Detection of Diverse Ammonia-Oxidising Archaeal and Bacterial Populations.
HMicroDB: A Comprehensive Database of Herpetofaunal Microbiota With a Focus on Host Phylogeny, Physiological Traits, and Environment Factors.
OGU: A Toolbox for Better Utilising Organelle Genomic Data.
Correction to "Characterisation of Putative Circular Plasmids in Sponge-Associated Bacterial Communities Using a Selective Multiply-Primed Rolling Circle Amplification".