FSTest: an efficient tool for cross-population fixation index estimation on variant call format files
Journal of Genetics. Journal Article. Published 2024-01-05. DOI: 10.1007/s12041-023-01459-1 (https://doi.org/10.1007/s12041-023-01459-1)
Abstract
Fixation index (Fst) statistics provide critical insights into the evolutionary processes that shape the structure of genetic variation within and among populations. Fst statistics have been widely applied in population and evolutionary genetics to identify genomic regions targeted by selection pressures. The FSTest 1.3 software was developed to estimate four Fst statistics (those of Hudson, Weir and Cockerham, Nei, and Wright) from high-throughput genotyping or sequencing data. Here, we introduce FSTest 1.3 and compare its performance with that of two widely used software packages, VCFtools 0.1.16 and PLINK 2.0. Chromosome 1 variant data from 1000 Genomes Phase III for South Asian (n = 211) and African (n = 274) populations were used as an example case. The different Fst estimates were calculated for each single-nucleotide polymorphism (SNP) in a pairwise comparison of the South Asian and African populations, and the results of FSTest 1.3 were confirmed by VCFtools 0.1.16 and PLINK 2.0. Two sliding window approaches, one based on a fixed number of SNPs and the other on a fixed number of base pairs (bp), were applied using FSTest 1.3 and VCFtools 0.1.16. Our results showed that regions with low-coverage genotypic data could lead to an overestimation of Fst in sliding window analysis using a fixed number of bp. FSTest 1.3 can mitigate this problem by averaging Fst over a fixed number of consecutive SNPs along the chromosome. FSTest 1.3 allows direct analysis of VCF files with a small amount of code and can calculate Fst estimates for more than a million SNPs on a desktop computer in a few minutes. FSTest 1.3 is freely available at https://github.com/similab/FSTest.
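To make the contrast between the two windowing schemes concrete, the following is a minimal Python sketch, not FSTest's own code. It assumes the commonly cited form of Hudson's estimator for a biallelic SNP, with allele frequencies p1, p2 and n1, n2 sampled chromosomes (twice the number of diploid individuals), and it assumes that a window value is the plain mean of per-SNP estimates. The function names `hudson_fst`, `snp_windows`, and `bp_windows` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def hudson_fst(p1: float, n1: int, p2: float, n2: int) -> float:
    """Hudson's Fst for one biallelic SNP (assumed estimator form).

    p1, p2: sample allele frequencies in the two populations.
    n1, n2: number of sampled chromosomes in each population.
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    # The estimator is undefined when both populations are fixed for the same allele.
    return num / den if den > 0 else float("nan")


def snp_windows(values: List[float], size: int, step: int) -> List[float]:
    """Mean Fst over windows containing a fixed number of consecutive SNPs."""
    return [
        sum(values[i:i + size]) / size
        for i in range(0, len(values) - size + 1, step)
    ]


def bp_windows(
    positions: List[int], values: List[float], size_bp: int, step_bp: int
) -> List[Tuple[int, int, float]]:
    """Mean Fst over windows of fixed physical length.

    Returns (window_start, n_snps, mean_fst); windows without SNPs are skipped.
    """
    out = []
    if not positions:
        return out
    for start in range(0, max(positions) + 1, step_bp):
        inside = [v for pos, v in zip(positions, values)
                  if start <= pos < start + size_bp]
        if inside:
            out.append((start, len(inside), sum(inside) / len(inside)))
    return out


if __name__ == "__main__":
    # Toy data: two SNPs close together, one isolated SNP in a sparse region.
    positions = [100, 200, 5000]
    fst = [0.1, 0.3, 0.9]
    print(bp_windows(positions, fst, size_bp=1000, step_bp=1000))
    print(snp_windows(fst, size=3, step=1))
```

In the toy data, the fixed-bp window that starts at 5000 contains a single SNP, so its mean is simply that SNP's value. A window holding a fixed number of SNPs always averages over the same count, which illustrates why sparse regions can inflate fixed-bp window estimates.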
About the journal:
The journal retains its traditional interest in evolutionary research that is of relevance to geneticists, even if this is not explicitly genetical in nature. The journal covers all areas of genetics and evolution, including molecular genetics and molecular evolution. It publishes papers and review articles on current topics, commentaries and essays on ideas and trends in genetics and evolutionary biology, historical developments, debates and book reviews. From 2010 onwards, the journal has published a special category of papers termed ‘Online Resources’. These are brief reports on the development and the routine use of molecular markers for assessing genetic variability within and among species. Also published are reports outlining pedagogical approaches in genetics teaching.