QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.
Authors: Austin Huang, Rami Kantor, Allison DeLong, Leeann Schreier, Sorin Istrail
Journal: In Silico Biology, volume 11, issues 5-6, pages 193-201 (publication date 2011-01-01)
DOI: https://doi.org/10.3233/ISB-2012-0454
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5530257/pdf/nihms879660.pdf
Next generation sequencing technologies have recently been applied to characterize mutational spectra of the heterogeneous population of viral genotypes (known as a quasispecies) within HIV-infected patients. Such information is clinically relevant because minority genetic subpopulations of HIV within patients enable viral escape from selection pressures such as the immune response and antiretroviral therapy. However, methods for quasispecies sequence reconstruction from next generation sequencing reads are not yet widely used, and the problem remains an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next-generation sequencing platforms used in practice are limited to shorter read lengths relative to 454 sequencing. Little work has been done to determine how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data.
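The conflict-graph coloring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no data format or solver details, so the read representation (a `(start, sequence)` pair), the conflict rule (two reads conflict when they disagree at a shared position), and all function names below are assumptions made for illustration. A plain backtracking search stands in for the constraint programming solver mentioned in the abstract.

```python
from itertools import combinations

def conflicts(a, b):
    """True if reads a and b disagree at some position they both cover.

    Each read is an assumed (start, sequence) pair. Reads that do not
    overlap share no positions and therefore never conflict.
    """
    (sa, seq_a), (sb, seq_b) = a, b
    lo = max(sa, sb)
    hi = min(sa + len(seq_a), sb + len(seq_b))
    return any(seq_a[p - sa] != seq_b[p - sb] for p in range(lo, hi))

def conflict_graph(reads):
    """Adjacency sets: an edge joins two reads that cannot share a genotype."""
    adj = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if conflicts(reads[i], reads[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj

def color_with(adj, k):
    """Try to color the graph with k colors by backtracking; None on failure."""
    order = sorted(adj, key=lambda v: -len(adj[v]))  # high degree first
    colors = {}

    def assign(pos):
        if pos == len(order):
            return True
        v = order[pos]
        used = {colors[u] for u in adj[v] if u in colors}
        for c in range(k):
            if c not in used:
                colors[v] = c
                if assign(pos + 1):
                    return True
                del colors[v]
        return False

    return dict(colors) if assign(0) else None

def min_coloring(adj):
    """Smallest k that admits a proper coloring, with one such coloring."""
    for k in range(1, len(adj) + 1):
        result = color_with(adj, k)
        if result is not None:
            return k, result
    return 0, {}

# Four toy reads drawn from two hypothetical variants that differ at position 2.
reads = [(0, "ACGT"), (2, "GTAA"), (0, "ACCT"), (2, "CTAA")]
k, coloring = min_coloring(conflict_graph(reads))
```

In this toy input the reads that agree with each other end up in the same color class, and each class can be read as one candidate quasispecies subsequence. The exhaustive search is only practical for small graphs, since minimum vertex coloring is NP-hard in general.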
In Silico Biology
Computer Science - Computational Theory and Mathematics
CiteScore
2.20
Self-citation rate
0.00%
Articles published
1
Journal description:
The considerable "algorithmic complexity" of biological systems requires a huge amount of detailed information for their complete description. Although far from complete, the overwhelming quantity of small pieces of information gathered for all kinds of biological systems at the molecular and cellular level requires computational tools to be adequately stored and interpreted. Interpreting data means abstracting them as far as possible to provide a systematic, integrative view of biology. Most of the presently available scientific journals focus either on accumulating more data from elaborate experimental approaches, or on presenting new algorithms for the interpretation of these data. Both approaches are meritorious.