Pub Date : 2024-05-02 DOI: 10.1186/s12711-024-00892-9
Andres Legarra, Matias Bermann, Quanshun Mei, Ole F. Christensen
The theory of “metafounders” proposes a unified framework for relationships across base populations within breeds (e.g. unknown parent groups), and base populations across breeds (crosses) together with a sensible compatibility with genomic relationships. Considering metafounders might be advantageous in pedigree best linear unbiased prediction (BLUP) or single-step genomic BLUP. Existing methods to estimate relationships across metafounders $\boldsymbol{\Gamma}$ are not well adapted to highly unbalanced data, genotyped individuals far from base populations, or many unknown parent groups (within breed per year of birth). We derive likelihood methods to estimate $\boldsymbol{\Gamma}$. For a single metafounder, summary statistics of pedigree and genomic relationships allow deriving a cubic equation with the real root being the maximum likelihood (ML) estimate of $\boldsymbol{\Gamma}$. This equation is tested with Lacaune sheep data. For several metafounders, we split the first derivative of the complete likelihood into a term related to $\boldsymbol{\Gamma}$, and a second term related to Mendelian sampling variances. Approximating the first derivative by its first term results in a pseudo-EM algorithm that iteratively updates the estimate of $\boldsymbol{\Gamma}$ by the corresponding block of the H-matrix. The method extends to complex situations with groups defined by year of birth, modelling the increase of $\boldsymbol{\Gamma}$ using estimates of the rate of increase of inbreeding ($\Delta F$), resulting in an expanded $\boldsymbol{\Gamma}$ and in a pseudo-EM+$\Delta F$ algorithm. We compare these methods with the generalized least squares (GLS) method using simulated data: complex crosses of two breeds in equal or unsymmetrical proportions; and in two breeds, with 10 groups per year of birth within breed. We simulate genotyping in all generations or in the last ones. For a single metafounder, the ML estimates of the Lacaune data corresponded to the maximum. For simulated data, when genotypes were spread across all generations, both GLS and pseudo-EM(+$\Delta F$) methods were accurate. With genotypes only available in the most recent generations, the GLS method was biased, whereas the pseudo-EM(+$\Delta F$) approach yielded more accurate and unbiased estimates. We derived ML, pseudo-EM and pseudo-EM+$\Delta F$ methods to estimate $\boldsymbol{\Gamma}$ in many realistic settings. Estimates are accurate in real and simulated data and have a low computational cost.
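The abstract does not state how the rate of increase of inbreeding is mapped onto the expanded metafounder relationship matrix, so the sketch below shows only the classical relation between the rate of increase of inbreeding and inbreeding after a number of generations. The function names and the 1% rate are illustrative assumptions, not values or code from the paper.

```python
def inbreeding_after(delta_f, generations, f0=0.0):
    """Closed form of the classical relation: F_t = 1 - (1 - F_0) * (1 - delta_f) ** t."""
    return 1.0 - (1.0 - f0) * (1.0 - delta_f) ** generations


def inbreeding_by_recursion(delta_f, generations, f0=0.0):
    """Same quantity built one generation at a time: F_{t+1} = F_t + delta_f * (1 - F_t)."""
    f = f0
    for _ in range(generations):
        f = f + delta_f * (1.0 - f)
    return f


if __name__ == "__main__":
    # Hypothetical rate of 1% per generation, followed over 10 generations.
    print(round(inbreeding_after(0.01, 10), 4))
    print(round(inbreeding_by_recursion(0.01, 10), 4))
```

Both functions should agree up to rounding; the recursion makes explicit that each generation adds a fixed fraction of the remaining non-inbred part.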
Title: Estimating genomic relationships of metafounders across and within breeds using maximum likelihood, pseudo-expectation–maximization maximum likelihood and increase of relationships. Journal: Genetics Selection Evolution.
Pub Date : 2024-05-02 DOI: 10.1186/s12711-024-00900-y
Elisa Somenzi, Erika Partel, Mario Barbato, Ana María Chero Osorio, Licia Colli, Niccolò Franceschi, Roberto Mantovani, Fabio Pilla, Matteo Komjanc, Alessandro Achilli, Heidi Christine Hauffe, Paolo Ajmone Marsan
Rendena is a dual-purpose cattle breed, which is primarily found in the Italian Alps and the eastern areas of the Po valley, and recognized for its longevity, fertility, disease resistance and adaptability to steep Alpine pastures. It is categorized as 'vulnerable to extinction' with only 6057 registered animals in 2022, yet no comprehensive analyses of its molecular diversity have been performed to date. The aim of this study was to analyse the origin, genetic diversity, and genomic signatures of selection in Rendena cattle using data from samples collected in 2000 and 2018, and shed light on the breed's evolution and conservation needs. Genetic analysis revealed that the Rendena breed shares genetic components with various Alpine and Po valley breeds, with a marked genetic proximity to the Original Braunvieh breed, reflecting historical restocking efforts across the region. The breed shows signatures of selection related to both milk and meat production, environmental adaptation and immune response, the latter being possibly the result of multiple rinderpest epidemics that swept across the Alps in the eighteenth century. An analysis of the Rendena cattle population spanning 18 years showed an increase in the mean level of inbreeding over time, which is confirmed by the mean number of runs of homozygosity per individual, which was larger in the 2018 sample. The Rendena breed, while sharing a common origin with Brown Swiss, has developed distinct traits that enable it to thrive in the Alpine environment and make it highly valued by local farmers. Preserving these adaptive features is essential, not only for maintaining genetic diversity and enhancing the ability of this traditional animal husbandry to adapt to changing environments, but also for guaranteeing the resilience and sustainability of both this livestock system and the livelihoods within the Rendena valley.
Title: Genetic legacy and adaptive signatures: investigating the history, diversity, and selection signatures in Rendena cattle resilient to eighteenth century rinderpest epidemics. Journal: Genetics Selection Evolution.
Pub Date : 2024-05-02 DOI: 10.1186/s12711-024-00901-x
Luis Varona, David López-Carbonell, Houssemeddine Srihi, Carlos Hervás-Rivero, Óscar González-Recio, Juan Altarriba
Recursive models are a category of structural equation models that propose a causal relationship between traits. These models are more parameterized than multiple trait models, and they require imposing restrictions on the parameter space to ensure statistical identification. Nevertheless, in certain situations, the likelihoods of recursive models and multiple trait models are equivalent. Consequently, the estimates of variance components derived from the multiple trait mixed model can be converted into estimates under several recursive models through LDL′ or block-LDL′ transformations. The procedure was employed on a dataset comprising five traits from the Pirenaica beef cattle breed: birth weight (BW), weight at 90 days (W90), weight at 210 days (W210), cold carcass weight (CCW) and conformation (CON). These phenotypic records were unequally distributed among 149,029 individuals and had a high percentage of missing data. The pedigree used consisted of 343,753 individuals. A Bayesian approach involving a multiple-trait mixed model was applied using a Gibbs sampler. The variance components obtained at each iteration of the Gibbs sampler were subsequently used to estimate the variance components within three distinct recursive models. The LDL′ or block-LDL′ transformations applied to the variance component estimates obtained from a multiple trait mixed model enabled inference across multiple sets of recursive models, with the sole prerequisite of being likelihood equivalent. Furthermore, the aforementioned transformations simplify the handling of missing data when conducting inference within the realm of recursive models.
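To make the LDL′ idea concrete, here is a minimal sketch on a small, made-up covariance matrix. It assumes a fully recursive ordering of traits, in which each trait depends on all earlier ones, and a generic covariance matrix; the abstract does not say which matrices the authors decompose or how the block variant is arranged. All function names and numbers are hypothetical.

```python
def ldl_decompose(cov):
    """Return (L, d) with cov = L * diag(d) * L', where L is unit lower triangular."""
    n = len(cov)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = cov[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            partial = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (cov[i][j] - partial) / d[j]
    return L, d


def unit_lower_inverse(L):
    """Invert a unit lower-triangular matrix by forward substitution."""
    n = len(L)
    inv = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i):
            inv[i][j] = -sum(L[i][k] * inv[k][j] for k in range(j, i))
    return inv


def recursive_coefficients(cov):
    """Map a covariance matrix to recursive coefficients and independent variances.

    lam[i][j] (i > j) is the effect of trait j on trait i, and d[i] is the
    variance left for trait i after conditioning on the earlier traits.
    """
    L, d = ldl_decompose(cov)
    inv = unit_lower_inverse(L)
    n = len(cov)
    lam = [[-inv[i][j] if i > j else 0.0 for j in range(n)] for i in range(n)]
    return lam, d


if __name__ == "__main__":
    # Hypothetical 3-trait covariance matrix (not from the Pirenaica data).
    G = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
    lam, d = recursive_coefficients(G)
    print(lam)
    print(d)
```

Because the mapping is a deterministic function of the covariance matrix, it could in principle be applied to each saved Gibbs sample, which is the kind of post-processing the abstract describes.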
Title: Equivalence of variance components between standard and recursive genetic models using LDL′ transformations. Journal: Genetics Selection Evolution.
Pub Date : 2024-04-29 DOI: 10.1186/s12711-024-00903-9
Lucio F. M. Mota, Diana Giannuzzi, Sara Pegolo, Enrico Sturaro, Daniel Gianola, Riccardo Negrini, Erminio Trevisi, Paolo Ajmone Marsan, Alessio Cecchinato
Metabolic disturbances adversely impact productive and reproductive performance of dairy cattle due to changes in endocrine status and immune function, which increase the risk of disease. This may occur in the post-partum phase, but also throughout lactation, with sub-clinical symptoms. Recently, increased attention has been directed towards improved health and resilience in dairy cattle, and genomic selection (GS) could be a helpful tool for selecting animals that are more resilient to metabolic disturbances throughout lactation. Hence, we evaluated the genomic prediction of serum biomarker levels for metabolic distress in 1353 Holsteins genotyped with the 100K single nucleotide polymorphism (SNP) chip assay. The GS was evaluated using parametric models, namely genomic best linear unbiased prediction (GBLUP), Bayesian B (BayesB) and elastic net (ENET), and nonparametric models, namely gradient boosting machine (GBM) and stacking ensemble (Stack), which combines the ENET and GBM approaches. The results show that the Stack approach outperformed other methods with a relative difference (RD), calculated as an increment in prediction accuracy, of approximately 18.0% compared to GBLUP, 12.6% compared to BayesB, 8.7% compared to ENET, and 4.4% compared to GBM. The highest RD in prediction accuracy between other models with respect to GBLUP was observed for haptoglobin (hapto) from 17.7% for BayesB to 41.2% for Stack; for Zn from 9.8% (BayesB) to 29.3% (Stack); for ceruloplasmin (CuCp) from 9.3% (BayesB) to 27.9% (Stack); for ferric reducing antioxidant power (FRAP) from 8.0% (BayesB) to 40.0% (Stack); and for total protein (PROTt) from 5.7% (BayesB) to 22.9% (Stack). Using a subset of top SNPs (1.5k) selected from the GBM approach improved the accuracy for GBLUP from 1.8 to 76.5%. However, for the other models reductions in prediction accuracy of 4.8% for ENET (average of 10 traits), 5.9% for GBM (average of 21 traits), and 6.6% for Stack (average of 16 traits) were observed. Our results indicate that the Stack approach was more accurate in predicting metabolic disturbances than GBLUP, BayesB, ENET, and GBM and seemed to be competitive for predicting complex phenotypes with various degrees of mode of inheritance, i.e. additive and non-additive effects. Selecting markers based on GBM improved accuracy of GBLUP.
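The relative difference (RD) quoted above is described only as an increment in prediction accuracy. A minimal sketch, assuming it is the percentage gain of one model's accuracy over a reference model's accuracy, might look as follows; the function name and the example accuracies are hypothetical and are not taken from the paper.

```python
def relative_difference(accuracy_model, accuracy_reference):
    """Percentage gain in prediction accuracy of a model over a reference model."""
    if accuracy_reference == 0:
        raise ValueError("reference accuracy must be non-zero")
    return 100.0 * (accuracy_model - accuracy_reference) / accuracy_reference


if __name__ == "__main__":
    # Hypothetical accuracies: 0.59 for a stacked model, 0.50 for the reference.
    print(round(relative_difference(0.59, 0.50), 1))
```

Under this definition, an RD of about 18% over GBLUP would correspond to, for example, an accuracy of 0.59 against 0.50. The absolute accuracies are not given in the abstract.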
Title: Genomic prediction of blood biomarkers of metabolic disorders in Holstein cattle using parametric and nonparametric models. Journal: Genetics Selection Evolution.
Pub Date : 2024-04-17 DOI: 10.1186/s12711-024-00898-3
Tristan Kistler, Evert W. Brascamp, Benjamin Basso, Piter Bijma, Florence Phocas
Breeding queens may be mated with drones that are produced by a single drone-producing queen (DPQ), or a group of sister-DPQs, but often only the dam of the DPQ(s) is reported in the pedigree. Furthermore, datasets may include colony phenotypes from DPQs that were open-mated at different locations, and thus to a heterogeneous drone population. Simulation was used to investigate the impact of the mating strategy and its modelling on the estimates of genetic parameters and genetic trends when the DPQs are treated in different ways in the statistical evaluation model. We quantified the bias and standard error of the estimates when breeding queens were mated to one DPQ or a group of DPQs, assuming that this information was known or not. We also investigated four alternative strategies to accommodate the phenotypes of open-mated DPQs in the genetic evaluation: excluding their phenotypes, adding a dummy pseudo-sire in the pedigree, or adding a non-genetic (fixed or random) effect to the statistical evaluation model to account for the origin of the mates. The most precise estimates of genetic parameters and genetic trends were obtained when breeding queens were mated with drones of single DPQs that are correctly assigned in the pedigree. However, when they were mated with drones from one or a group of DPQs, and this information was not known, erroneous assumptions led to considerable bias in these estimates. Furthermore, genetic variances were considerably overestimated when phenotypes of colonies from open-mated DPQs were adjusted for their mates by adding a dummy pseudo-sire in the pedigree for each subpopulation of open-mating drones. On the contrary, correcting for the heterogeneous drone population by adding a non-genetic effect in the evaluation model produced unbiased estimates. Knowing only the dam of the DPQ(s) used in each mating may lead to erroneous assumptions on how DPQs were used and severely bias the estimates of genetic parameters and trends. 
Thus, we recommend keeping track of DPQs in the pedigree, and not only of the dams of DPQ(s). Records from DPQ colonies with queens open-mated to a heterogeneous drone population can be integrated by adding non-genetic effects to the statistical evaluation model.
Title: Uncertainty in the mating strategy of honeybees causes bias and unreliability in the estimates of genetic parameters. Journal: Genetics Selection Evolution.
Pub Date : 2024-04-16 DOI: 10.1186/s12711-024-00902-w
Mary Kate Hollifield, Ching-Yi Chen, Eric Psota, Justin Holl, Daniela Lourenco, Ignacy Misztal
With the introduction of digital phenotyping and high-throughput data, traits that were previously difficult or impossible to measure directly have become easily accessible, offering the opportunity to enhance the efficiency and rate of genetic gain in animal production. It is of interest to assess how behavioral traits are indirectly related to the production traits during the performance testing period. The aim of this study was to assess the quality of behavior data extracted from day-wise video recordings and estimate the genetic parameters of behavior traits and their phenotypic and genetic correlations with production traits in pigs. Behavior was recorded for 70 days after on-test at about 10 weeks of age and ended at off-test for 2008 female purebred pigs, totaling 119,812 day-wise records. Behavior traits included time spent eating, drinking, laterally lying, sternally lying, sitting, standing, and meters of distance traveled. A quality control procedure was created for algorithm training and adjustment, standardizing recording hours, removing culled animals, and filtering unrealistic records. Production traits included average daily gain (ADG), back fat thickness (BF), and loin depth (LD). Single-trait linear models were used to estimate heritabilities of the behavior traits and two-trait linear models were used to estimate genetic correlations between behavior and production traits. The results indicated that all behavior traits are heritable, with heritability estimates ranging from 0.19 to 0.57, and showed low-to-moderate phenotypic and genetic correlations with production traits. Two-trait linear models were also used to compare traits at different intervals of the recording period. To analyze the redundancies in behavior data during the recording period, the averages of various recording time intervals for the behavior and production traits were compared. 
Overall, the average of the 55- to 68-day recording interval had the strongest phenotypic and genetic correlation estimates with the production traits. Digital phenotyping is a new and low-cost method to record behavior phenotypes, but thorough data cleaning procedures are needed. Evaluating behavioral traits at different time intervals offers a deeper insight into their changes throughout the growth periods and their relationship with production traits, which may be recorded on a less frequent basis.
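As a reminder of how the reported parameters follow from variance components, the sketch below computes a heritability and a genetic correlation. It assumes a model with only additive genetic and residual variance; the authors' models may include further effects that would enter the denominator. The function names and numbers are hypothetical.

```python
import math


def heritability(var_additive, var_residual):
    """Narrow-sense heritability as additive variance over total variance."""
    return var_additive / (var_additive + var_residual)


def genetic_correlation(cov_additive, var_additive_1, var_additive_2):
    """Additive genetic covariance scaled by both additive standard deviations."""
    return cov_additive / math.sqrt(var_additive_1 * var_additive_2)


if __name__ == "__main__":
    # Hypothetical variance components from single-trait and two-trait models.
    print(heritability(1.0, 3.0))
    print(genetic_correlation(1.0, 4.0, 1.0))
```

In a two-trait analysis the additive covariance and both additive variances come from the same estimated covariance matrix, so the correlation stays within the range from -1 to 1 whenever that matrix is positive definite.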
Title: Estimating genetic parameters of digital behavior traits and their relationship with production traits in purebred pigs. Journal: Genetics Selection Evolution.
Pub Date : 2024-04-09 DOI: 10.1186/s12711-024-00897-4
Chao Wang, Choulin Chen, Bowen Lei, Shenghua Qin, Yuanyuan Zhang, Kui Li, Song Zhang, Yuwen Liu
Enhancer RNAs (eRNAs) play a crucial role in transcriptional regulation. While significant progress has been made in understanding epigenetic regulation mediated by eRNAs, research on the construction of eRNA-mediated gene regulatory networks (eGRN) and the identification of critical network components that influence complex traits is lacking. Here, employing the pig as a model, we conducted a comprehensive study using H3K27ac histone ChIP-seq and RNA-seq data to construct eRNA expression profiles from multiple tissues of two distinct pig breeds, namely Enshi Black (ES) and Duroc. In addition to revealing the regulatory landscape of eRNAs at the tissue level, we developed an innovative network construction and refinement method by integrating RNA-seq, ChIP-seq, genome-wide association study (GWAS) signals and enhancer-modulating effects of single nucleotide polymorphisms (SNPs) measured by self-transcribing active regulatory region sequencing (STARR-seq) experiments. Using this approach, we unraveled eGRN that significantly influence the growth and development of muscle and fat tissues, and identified several novel genes that affect adipocyte differentiation in a cell line model. Our work not only provides novel insights into the genetic basis of economic pig traits, but also offers a generalizable approach to elucidate the eRNA-mediated transcriptional regulation underlying a wide spectrum of complex traits for diverse organisms.
{"title":"Constructing eRNA-mediated gene regulatory networks to explore the genetic basis of muscle and fat-relevant traits in pigs","authors":"Chao Wang, Choulin Chen, Bowen Lei, Shenghua Qin, Yuanyuan Zhang, Kui Li, Song Zhang, Yuwen Liu","doi":"10.1186/s12711-024-00897-4","DOIUrl":"https://doi.org/10.1186/s12711-024-00897-4","journal":{"name":"Genetics Selection Evolution"},"publicationDate":"2024-04-09"}
Pub Date : 2024-04-08 DOI: 10.1186/s12711-024-00896-5
Helen Schneider, Valentin Haas, Ana-Marija Krizanac, Clemens Falker-Gieske, Johannes Heise, Jens Tetens, Georg Thaller, Jörn Bennewitz
Claw diseases and mastitis are the most important health issues in dairy cattle, and a connection to milk production is frequently mentioned. Although many studies have investigated this connection in more detail by estimating genetic correlations, such estimates do not provide information about causality. An alternative is to carry out Mendelian randomization (MR) studies, which use genetic variants as instruments to investigate the effect of an exposure on an outcome trait. No study has yet investigated the causal association of milk yield (MY) with health traits in dairy cattle. Hence, we performed an MR analysis of MY and seven health traits using imputed whole-genome sequence data from 34,497 German Holstein cows. We applied a method that uses summary statistics and removes horizontal pleiotropic variants (having an effect on both traits), which improves the power and unbiasedness of MR studies. In addition, genetic correlations between MY and each health trait were estimated to compare them with the estimates of causal effects that we expected. All genetic correlations between MY and each health trait were negative, ranging from −0.303 (mastitis) to −0.019 (digital dermatitis), which indicates a reduced health status as MY increases. The only non-significant correlation was between MY and digital dermatitis. In addition, each causal association was negative, ranging from −0.131 (mastitis) to −0.034 (laminitis), but the number of significant associations was reduced to five nominally significant and two experiment-wide significant results. The latter were between MY and mastitis and between MY and digital phlegmon. Horizontal pleiotropic variants were identified for mastitis, digital dermatitis and digital phlegmon. They were located within or near variants that were previously reported to have a horizontal pleiotropic effect, e.g., on milk production and somatic cell count.
Our results confirm the known negative genetic connection between health traits and MY in dairy cattle. In addition, they provide new information about causality, which for example points to the negative energy balance mediating the connection between these traits. This knowledge helps to better understand whether the negative genetic correlation is based on pleiotropy, linkage between causal variants for both trait complexes, or indeed on a causal association.
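To make the idea of an MR analysis from summary statistics concrete, the following is a minimal sketch of a generic fixed-effect inverse-variance weighted (IVW) estimator. It is not the method used in the study, which the abstract does not name; the function name, the dictionary keys, the `pleiotropic` flag (standing in for whatever upstream test identifies horizontal pleiotropic variants) and the toy numbers are all assumptions made for illustration.

```python
import math


def ivw_causal_estimate(variants):
    """Fixed-effect inverse-variance weighted (IVW) MR estimate.

    Each variant is a dict with hypothetical keys:
      beta_exposure: variant effect on the exposure (e.g. milk yield)
      beta_outcome:  variant effect on the outcome (e.g. a health trait)
      se_outcome:    standard error of beta_outcome
      pleiotropic:   optional flag set by an upstream pleiotropy test
    """
    # Drop variants flagged as horizontally pleiotropic before estimation.
    kept = [v for v in variants if not v.get("pleiotropic", False)]
    if not kept:
        raise ValueError("no instruments left after filtering")

    # Weighted regression of outcome effects on exposure effects
    # through the origin, with weights 1 / se_outcome^2.
    numerator = sum(
        v["beta_exposure"] * v["beta_outcome"] / v["se_outcome"] ** 2
        for v in kept
    )
    denominator = sum(
        v["beta_exposure"] ** 2 / v["se_outcome"] ** 2 for v in kept
    )
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Toy summary statistics, made up for illustration (not from the study).
toy_variants = [
    {"beta_exposure": 0.2, "beta_outcome": -0.02, "se_outcome": 0.01},
    {"beta_exposure": 0.4, "beta_outcome": -0.04, "se_outcome": 0.02},
    {"beta_exposure": 0.5, "beta_outcome": -0.05, "se_outcome": 0.01},
    # Flagged variant: excluded from the estimate.
    {"beta_exposure": 0.3, "beta_outcome": 0.09, "se_outcome": 0.01,
     "pleiotropic": True},
]

estimate, standard_error = ivw_causal_estimate(toy_variants)
print(f"IVW estimate: {estimate:.3f} (SE {standard_error:.3f})")
```

In this toy input every retained variant has an outcome effect equal to −0.1 times its exposure effect, so the estimate recovers a negative causal effect, the same direction as the associations reported above. Excluding the flagged variant mirrors, in a very simplified way, the removal of horizontal pleiotropic variants described in the abstract.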
{"title":"Mendelian randomization analysis of 34,497 German Holstein cows to infer causal associations between milk production and health traits","authors":"Helen Schneider, Valentin Haas, Ana-Marija Krizanac, Clemens Falker-Gieske, Johannes Heise, Jens Tetens, Georg Thaller, Jörn Bennewitz","doi":"10.1186/s12711-024-00896-5","DOIUrl":"https://doi.org/10.1186/s12711-024-00896-5","journal":{"name":"Genetics Selection Evolution"},"publicationDate":"2024-04-08"}
Pub Date : 2024-04-02 DOI: 10.1186/s12711-024-00895-6
Iliyass Biada, Noelia Ibáñez-Escriche, Agustín Blasco, Cristina Casto-Rebollo, Maria A. Santacreu
Longevity and resilience are two fundamental traits for more sustainable livestock production. These traits are closely related, as resilient animals tend to have longer lifespans. An interesting criterion for increasing longevity in rabbits could be based on the information provided by the gut microbiome. The gut microbiome is essential for regulating health and plays crucial roles in the development of the immune system. The aim of this research was to investigate whether animals with different longevities have different microbial profiles. We sequenced the 16S rRNA gene from soft faeces from 95 does. First, we compared two maternal rabbit lines with different longevities: a standard longevity maternal line (A) and a maternal line (LP) that was founded based on longevity criteria, namely females with a minimum of 25 parities and an average prolificacy per parity of 9 or more. Second, we compared the gut microbiota of two groups of animals from line LP with different longevities: females that died or were culled with two parities or fewer (LLP) and females with more than 15 parities (HLP). Differences in alpha and beta diversity were observed between lines A and LP, and a partial least squares discriminant analysis (PLS-DA) showed a high prediction accuracy (>91%) for classifying animals to line A versus LP (146 amplicon sequence variants (ASV)). The PLS-DA also showed a high prediction accuracy (>94%) for classifying animals to the LLP and HLP groups (53 ASV). Interestingly, some of the most important taxa identified in the PLS-DA were common to both comparisons (Akkermansia, Christensenellaceae R-7, Uncultured Eubacteriaceae, among others) and have been reported to be related to resilience and longevity. Our results indicate that the first-parity gut microbiome profile differs between the two rabbit maternal lines (A and LP) and, to a lesser extent, between animals of line LP with different longevities (LLP and HLP).
Several genera were able to discriminate animals from the two lines and animals with different longevities, which shows that the gut microbiome could be used as a predictive factor for longevity, or as a selection criterion for these traits.
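The alpha and beta diversity comparisons mentioned above can be illustrated with a short sketch. The abstract does not say which indices were used, so the Shannon index (alpha diversity, within one sample) and Bray-Curtis dissimilarity (beta diversity, between two samples) are assumed here as common choices; the function names and the toy ASV counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples over the same taxa.

    0 means identical composition, 1 means no shared taxa.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must be counted over the same taxa")
    shared_total = sum(a + b for a, b in zip(counts_a, counts_b))
    if shared_total == 0:
        raise ValueError("both samples are empty")
    difference = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    return difference / shared_total


# Toy ASV count vectors, made up for illustration (not from the study).
sample_line_a = [10, 10, 10, 10]   # evenly spread community
sample_line_lp = [37, 1, 1, 1]     # community dominated by one ASV

print("Shannon, line A sample: ", round(shannon_index(sample_line_a), 3))
print("Shannon, line LP sample:", round(shannon_index(sample_line_lp), 3))
print("Bray-Curtis A vs LP:    ", round(bray_curtis(sample_line_a, sample_line_lp), 3))
```

In practice such indices are computed per animal and then compared between groups (here, lines A and LP, or LLP and HLP) with an appropriate statistical test; that step is omitted from the sketch.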
{"title":"Microbiome composition as a potential predictor of longevity in rabbits","authors":"Iliyass Biada, Noelia Ibáñez-Escriche, Agustín Blasco, Cristina Casto-Rebollo, Maria A. Santacreu","doi":"10.1186/s12711-024-00895-6","DOIUrl":"https://doi.org/10.1186/s12711-024-00895-6","journal":{"name":"Genetics Selection Evolution"},"publicationDate":"2024-04-02"}
Pub Date : 2024-04-02 DOI: 10.1186/s12711-024-00893-8
Ruifei Yang, Siqi Jin, Suyun Fang, Dawei Yan, Hao Zhang, Jingru Nie, Jinqiao Liu, Minjuan Lv, Bo Zhang, Xinxing Dong
Gene flow is crucial for enhancing economic traits of livestock. In China, breeders have used hybridization strategies for decades to improve livestock performance. Here, we performed whole-genome sequencing of a native Chinese Lijiang pig (LJP) breed. By integrating previously published data, we explored the genetic structure and introgression of genetic components from commercial European pigs (EP) into the LJP, and examined the impact of this introgression on phenotypic traits. Our analysis revealed significant introgression of EP breeds into the LJP and other domestic pig breeds in China. Using a haplotype-based approach, we quantified introgression levels and compared EP to LJP and other Chinese domestic pigs. The results show that EP introgression is widely prevalent in Chinese domestic pigs, although there are significant differences between breeds. We propose that LJP could potentially act as a mediator for the transmission of EP haplotypes. We also examined the correlation between EP introgression and the number of thoracic vertebrae in LJP and identified VRTN and STUM as candidate genes for this trait. Our study provides evidence of introgressed European haplotypes in the LJP breed and describes the potential role of EP introgression on phenotypic changes of this indigenous breed.
{"title":"Genetic introgression from commercial European pigs to the indigenous Chinese Lijiang breed and associated changes in phenotypes","authors":"Ruifei Yang, Siqi Jin, Suyun Fang, Dawei Yan, Hao Zhang, Jingru Nie, Jinqiao Liu, Minjuan Lv, Bo Zhang, Xinxing Dong","doi":"10.1186/s12711-024-00893-8","DOIUrl":"https://doi.org/10.1186/s12711-024-00893-8","journal":{"name":"Genetics Selection Evolution"},"publicationDate":"2024-04-02"}