Control for stochastic sampling variation and qualitative sequencing error in next generation sequencing

Biomolecular Detection and Quantification (Q1; Biochemistry, Genetics and Molecular Biology) | Pub Date: 2015-09-01 | DOI: 10.1016/j.bdq.2015.08.003
Thomas Blomquist, Erin L. Crawford, Jiyoun Yeo, Xiaolu Zhang, James C. Willey
Biomolecular Detection and Quantification, volume 5, pages 30-37. Publisher page: https://www.sciencedirect.com/science/article/pii/S221475351530005X
Citations: 10

Abstract

Background

Clinical implementation of Next-Generation Sequencing (NGS) is challenged by poor control for stochastic sampling, library preparation biases and qualitative sequencing error. To address these challenges we developed and tested two hypotheses.

Methods

Hypothesis 1: Analytical variation in quantification is predicted by stochastic sampling effects at input of (a) amplifiable nucleic acid target molecules into the library preparation, (b) amplicons from library into sequencer, or (c) both. We derived equations using Monte Carlo simulation to predict assay coefficient of variation (CV) based on these three working models, and tested them against NGS data from specimens with well-characterized molecule inputs and sequence counts, prepared using a competitive multiplex-PCR amplicon-based NGS library preparation method comprising synthetic internal standards (IS). Hypothesis 2: Frequencies of technically derived qualitative sequencing errors (i.e., base substitution, insertion and deletion) observed at each base position in each target native template (NT) are concordant with those observed in the respective competitive synthetic IS present in the same reaction. We measured error frequencies at each base position within amplicons from each of 30 target NT, then tested whether they correspond to those within the 30 respective IS.
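
The two-stage sampling idea behind Hypothesis 1 can be illustrated with a small Monte Carlo sketch. This is not the authors' model or their derived equations. It assumes, purely for illustration, that the number of NT and IS molecules entering library preparation and the number of reads entering the sequencer are each Poisson distributed, and that the measured quantity is the NT/IS read ratio. The function names `poisson` and `simulate_cv` and all parameter values are hypothetical.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) variate.

    Uses Knuth's multiplication method for small lam and a rounded normal
    approximation for large lam (an approximation, chosen to keep the sketch
    dependency-free).
    """
    if lam <= 0:
        return 0
    if lam > 50:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_cv(mean_molecules, mean_reads, trials=2000, seed=0):
    """Estimate the CV of the NT/IS read ratio under two sampling events.

    Assumption: NT and IS have the same expected molecule input and the same
    expected read count; both values are illustrative, not from the paper.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        # Sampling event 1: molecules entering library preparation.
        nt_mol = poisson(rng, mean_molecules)
        is_mol = poisson(rng, mean_molecules)
        # Sampling event 2: library amplicons entering the sequencer.
        nt_reads = poisson(rng, mean_reads * nt_mol / mean_molecules)
        is_reads = poisson(rng, mean_reads * is_mol / mean_molecules)
        if is_reads > 0:  # skip trials where the ratio is undefined
            ratios.append(nt_reads / is_reads)
    return statistics.stdev(ratios) / statistics.mean(ratios)


# Low molecule input should give a visibly larger CV than high input.
print("CV at 10 molecules:  ", simulate_cv(10, 1000))
print("CV at 1000 molecules:", simulate_cv(1000, 1000))
```

Setting `mean_reads` very high isolates the first sampling event (model a), and setting `mean_molecules` very high isolates the second (model b), which is one way to compare the three working models in a simulation of this kind.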

Results

For hypothesis 1, the Monte Carlo model derived from both sampling events best predicted CV and explained 74% of observed assay variance. For hypothesis 2, the observed frequency and type of sequence variation at each base position within each IS were concordant with those observed in the respective NTs (R² = 0.93).
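
As a rough illustration of the concordance measure reported for hypothesis 2, the sketch below computes a squared Pearson correlation between per-position error frequencies in an NT and in its IS. Whether the authors computed R² in exactly this way (for example on raw or log-transformed frequencies) is not stated in the abstract, so this is an assumption. The function name `r_squared` is hypothetical, and the frequency values are toy numbers, not data from the paper.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Toy per-position error frequencies (illustrative only, not from the paper).
nt_error_freq = [0.0010, 0.0004, 0.0025, 0.0007, 0.0015]
is_error_freq = [0.0011, 0.0005, 0.0022, 0.0006, 0.0017]

print("R^2 between NT and IS error profiles:",
      r_squared(nt_error_freq, is_error_freq))
```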

Conclusion

In targeted NGS, synthetic competitive IS control for stochastic sampling at input of both target into library preparation and of target library product into sequencer, and control for qualitative errors generated during library preparation and sequencing. These controls enable accurate clinical diagnostic reporting of confidence limits and limit of detection for copy number measurement, and of frequency for each actionable mutation.
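
To make the link between stochastic sampling and a limit of detection concrete, here is a minimal sketch under a simple Poisson assumption. It is not the authors' reporting procedure. If the number of target copies in an aliquot is Poisson distributed with mean λ, the chance of sampling zero copies is exp(-λ), so the mean needed to capture at least one copy with a given probability follows directly. The function names are hypothetical.

```python
import math


def min_mean_copies(detect_prob=0.95):
    """Smallest Poisson mean giving P(at least one copy) >= detect_prob."""
    if not 0.0 < detect_prob < 1.0:
        raise ValueError("detect_prob must be strictly between 0 and 1")
    # P(zero copies) = exp(-lam) <= 1 - detect_prob  =>  lam >= -ln(1 - detect_prob)
    return -math.log(1.0 - detect_prob)


def poisson_cv(mean_copies):
    """CV attributable to Poisson sampling alone: sqrt(lam) / lam."""
    return 1.0 / math.sqrt(mean_copies)


print("Mean copies for 95% detection:", min_mean_copies(0.95))
print("Sampling CV at 100 copies:", poisson_cv(100))
```

Under this assumption, roughly three copies on average are needed for a 95% chance that an aliquot contains the target at all, and the sampling CV shrinks with the square root of the copy number.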
