A hidden Markov model to estimate homozygous-by-descent probabilities associated with nested layers of ancestors
Authors: Tom Druet, Mathieu Gautier
Journal: Theoretical Population Biology (impact factor 1.2; JCR Q4, Ecology)
Published: 2022-06-01
DOI: 10.1016/j.tpb.2022.03.001
URL: https://www.sciencedirect.com/science/article/pii/S0040580922000168
Citations: 3
Abstract
Inbreeding results from the mating of related individuals and has negative consequences because it brings together deleterious variants in one individual. Genomic estimates of the inbreeding coefficients are preferred to pedigree-based estimators as they measure the realized inbreeding levels and they are more robust to pedigree errors. Several methods identifying homozygous-by-descent (HBD) segments with hidden Markov models (HMM) have been recently developed and are particularly valuable when the information is degraded or heterogeneous (e.g., low-fold sequencing, low marker density, heterogeneous genotype quality or variable marker spacing). We previously developed a multiple HBD class HMM where HBD segments are classified in different groups based on their length (e.g., recent versus old HBD segments) but we recently observed that for high inbreeding levels with many HBD segments, the estimated contributions might be biased towards more recent classes (i.e., associated with large HBD segments) although the overall estimated level of inbreeding remained unbiased. We herein propose a new model in which the HBD classification is modelled in successive nested levels with decreasing expected HBD segment lengths, the underlying exponential rates being directly related to the number of generations to the common ancestor. The non-HBD classes are now modelled as a mixture of HBD segments from later generations and shorter non-HBD segments (i.e., both with higher rates). The new model has improved statistical properties and performs better on simulated data compared to our previous version. We also show that the parameters of the model are easier to interpret and that the model is more robust to the choice of the number of classes. Overall, the new model results in an improved partitioning of inbreeding in different HBD classes and should be preferred.
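The link between exponential rates and generations to the common ancestor can be illustrated with a minimal sketch. It assumes, as a common parameterization not stated in the abstract itself, that HBD segment lengths are exponentially distributed with a rate (per Morgan) equal to twice the number of generations to the common ancestor. The function names and the example generation values are hypothetical and are not taken from the authors' software.

```python
import math


def expected_hbd_length_cm(generations: float) -> float:
    """Expected HBD segment length in centimorgans.

    Assumption (not from the abstract): rate = 2 * generations per Morgan,
    so the mean of the exponential length distribution is 1 / rate Morgans.
    """
    rate = 2.0 * generations
    return 100.0 / rate


def prob_segment_continues(rate: float, distance_morgans: float) -> float:
    """Probability that an exponential-length segment with the given rate
    extends at least `distance_morgans` further along the chromosome."""
    return math.exp(-rate * distance_morgans)


# Illustrative nested layers of ancestors: older ancestors give higher rates
# and therefore shorter expected HBD segments.
for g in (2, 8, 32, 128):
    rate = 2.0 * g
    print(
        f"generations={g:>3}  rate={rate:>5.0f}  "
        f"expected length={expected_hbd_length_cm(g):.2f} cM  "
        f"P(continue over 1 cM)={prob_segment_continues(rate, 0.01):.3f}"
    )
```

Under this assumption, each successive layer of more ancient ancestors has a higher rate and a shorter expected segment length, which matches the ordering of nested levels described in the abstract.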
About the journal:
An interdisciplinary journal, Theoretical Population Biology presents articles on theoretical aspects of the biology of populations, particularly in the areas of demography, ecology, epidemiology, evolution, and genetics. Emphasis is on the development of mathematical theory and models that enhance the understanding of biological phenomena.
Articles highlight the motivation and significance of the work for advancing progress in biology, relying on a substantial mathematical effort to obtain biological insight. The journal also presents empirical results and computational and statistical methods directly impinging on theoretical problems in population biology.