Authors: Xiujuan Sun, Fa Zhang, Xiaohua Wan, Jinzhi Zhang
Published in: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2016-12-01
DOI: 10.1109/BIBM.2016.7822646 (https://doi.org/10.1109/BIBM.2016.7822646)
aWGRS: Automates paired-end whole genome re-sequencing data analysis framework
To spare users cumbersome command-line operations and repeated parameter tuning, this paper presents aWGRS, an automated paired-end whole genome re-sequencing data processing framework with pre-installed dependencies that maps reads to a reference and realigns variations. aWGRS takes paired-end reads and a reference genome as input and returns re-sequencing information. The tool is built on the observation that re-sequencing requires several steps: alignment to the reference, single nucleotide polymorphism (SNP) calling, insertion/deletion (InDel) calling, structural variant (SV) calling, and annotation. By introducing and adjusting a new concept called the recall rate, the coverage rate and the accuracy rate can be satisfied at the same time. Within the range of the recall rate, a variation is evaluated by two criteria, its quality value and the number of reads that support it, and the candidate with the higher quality value and the larger number of supporting reads is finally selected. Genome-wide genetic variations between precocious trifoliate orange and its wild type were identified in [1]; compared with those results, which were provided by the Beijing Genomics Institute (BGI), the empirical results of aWGRS show a large reduction in the number of variations and a marked improvement in accuracy. Overall, the adjustable parameters in aWGRS affect the experimental results, and the default filtering strategy based on the mutation recall rate also attains good results automatically.
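The two-criteria selection described in the abstract can be illustrated with a minimal Python sketch. This is not the aWGRS implementation: the abstract does not define how the recall rate bounds the candidate set, so the sketch simply takes a list of candidates as given. The `Candidate` fields, the `pick_candidate` function, the `min_quality` and `min_support` thresholds, and the choice to compare quality first and use read support as the tie-breaker are all assumptions made for illustration, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """A candidate variation call (all field names are hypothetical)."""
    chrom: str
    pos: int
    alt: str
    quality: float  # caller-reported quality value
    support: int    # number of reads supporting the call


def pick_candidate(candidates: Iterable[Candidate],
                   min_quality: float = 0.0,
                   min_support: int = 0) -> Optional[Candidate]:
    """Return the eligible candidate with the highest (quality, support).

    Returns None when no candidate passes both thresholds.
    """
    eligible = [c for c in candidates
                if c.quality >= min_quality and c.support >= min_support]
    if not eligible:
        return None
    # Assumption: quality is compared first; read support breaks ties.
    return max(eligible, key=lambda c: (c.quality, c.support))


if __name__ == "__main__":
    # Invented example values, for illustration only.
    calls = [
        Candidate("chr1", 1042, "T", quality=35.0, support=4),
        Candidate("chr1", 1042, "G", quality=58.0, support=11),
        Candidate("chr1", 1042, "A", quality=58.0, support=7),
    ]
    best = pick_candidate(calls, min_quality=30.0, min_support=5)
    print(best)
```

With these invented inputs, the first call is dropped for having too few supporting reads, and of the two remaining calls with equal quality, the one with more supporting reads is kept. A real pipeline would more likely read these values from the variant caller's output than from hand-built objects.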