Pub Date : 2025-04-09 DOI: 10.1186/s12711-025-00965-3
Joseph L. Matt, Jessica Moss Small, Peter D. Kube, Standish K. Allen
Triploid oysters, bred by crossing tetraploid and diploid oysters, are common worldwide in commercial oyster aquaculture and make up much of the hatchery-produced Crassostrea virginica farmed in the mid-Atlantic and southeast of the United States. Breeding diploid and tetraploid animals for genetic improvement of triploid progeny is unique to oysters and can proceed via several possible breeding strategies. Triploid oysters, along with their diploid or tetraploid relatives, have yet to be subjected to quantitative genetic analyses that could inform a breeding strategy for triploid improvement. The importance of quantitative genetic analyses involving triploid C. virginica has been emphasized by the occurrence of mortality events of near-market sized triploids in late spring. Genetic parameters for survival and weight of triploid and tetraploid C. virginica were estimated from twenty paternal half-sib triploid families and thirty-nine full-sib tetraploid families reared at three sites in the Chesapeake Bay (USA). Traits were analyzed using linear mixed models in ASReml-R. Genetic relationship matrices appropriate for pedigrees with triploid and tetraploid animals were produced using the polyAinv package in R. A mortality event in triploids occurred at one site located on the bayside of the Eastern Shore of Virginia. Between early May and early July, three triploid families had survival of less than 0.70, while most had survival greater than 0.90. The heritability for survival during this period in triploids at this affected site was 0.57 ± 0.23. Triploid survival at the affected site was negatively correlated with triploid survival at the low salinity site (−0.50 ± 0.23) and unrelated to tetraploid survival at the site with similar salinity (0.05 ± 0.39). Survival during a late spring mortality event in triploids had a substantial additive genetic basis, suggesting selective breeding of tetraploids can reduce triploid mortalities. 
Genetic correlations revealed evidence of genotype by environment interactions for triploid survival and weak genetic correlations between survival of tetraploids and triploids. A selective breeding strategy with phenotyping of tetraploid and triploid half-sibs is recommended for genetic improvement of triploid oysters.
Title: Quantitative genetic analysis of late spring mortality in triploid Crassostrea virginica. Journal: Genetics Selection Evolution.
Pub Date : 2025-04-07 DOI: 10.1186/s12711-025-00948-4
Lisa Büttgen, Henner Simianer, Torsten Pook
Genomic selection has become an integral component of modern animal breeding programs, having the potential to improve the efficiency of layer breeding programs both by obtaining higher prediction accuracies and reducing the generation interval, particularly for males, who cannot be phenotyped for sex-limited traits such as laying performance. In the current study, we investigate different strategies to reduce the generation interval either for both sexes or only for the male side of the breeding scheme based on stochastic simulation using the software MoBPS. Additionally, prediction accuracies based on varying proportions of genotyping and phenotype- and pedigree-based selection as well as genomic breeding values are compared. Selection of hens based on estimated breeding values, either pedigree-based or genomic, increased genetic gain compared to selection based on phenotypes only. The use of two time-shifted subpopulations with exchange of males between subpopulations to reduce the generation interval on the male side led to significantly higher genetic gains. Reducing the generation interval for both males and females was only efficient when population sizes were maintained, which resulted in a doubling of the number of females to genotype and phenotype within the same time frame compared to the scenarios with the longer generation intervals. Although substantially higher gains were obtained, in particular by pedigree-based selection of females and by a reduction of generation intervals, this led to substantially greater rates of inbreeding per year. The use of a genomic relationship matrix in breeding value estimation instead of a pedigree-based relationship matrix not only increased genetic gains but also reduced inbreeding rates. The use of optimum contribution selection led to basically the same genetic gains as without it but reduced inbreeding rates. 
However, overall differences obtained with optimal contribution selection were small compared to differences caused by the other effects that were considered. The reduction of the generation interval on the male side by the use of genomic estimated breeding values was highly beneficial. Reduction of the generation interval on the female side was only beneficial when a high proportion of hens was genotyped and housing capacities were increased. On the female side of a layer breeding program, selection based on pedigree-based estimated breeding values was inferior to phenotypic selection, as it resulted in a substantial increase in inbreeding rates.
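The link between generation interval and gain per year that runs through this abstract can be illustrated with the textbook breeder's equation, ΔG per year = i · r · σ_A / L. The sketch below is a simplification and not the MoBPS simulation used in the study: the function name and all numeric values are hypothetical, and it holds selection intensity and accuracy fixed, whereas in the study these change with the genotyping and phenotyping strategy.

```python
def annual_genetic_gain(intensity, accuracy, sigma_a, generation_interval):
    """Expected genetic gain per year from the breeder's equation.

    intensity: standardized selection intensity (i)
    accuracy: correlation between estimated and true breeding values (r)
    sigma_a: additive genetic standard deviation
    generation_interval: average age of parents when offspring are born, in years (L)
    """
    if generation_interval <= 0:
        raise ValueError("generation_interval must be positive")
    return intensity * accuracy * sigma_a / generation_interval


# Hypothetical inputs, not taken from the study.
long_interval = annual_genetic_gain(1.4, 0.6, 1.0, 1.0)
short_interval = annual_genetic_gain(1.4, 0.6, 1.0, 0.5)
print(f"gain per year, long interval:  {long_interval:.2f}")
print(f"gain per year, short interval: {short_interval:.2f}")
```

With everything else equal, halving the generation interval doubles the expected gain per year. The abstract's caveat is that everything else is rarely equal: a shorter interval can lower accuracy or require more animals to be genotyped and phenotyped in the same time frame.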
Title: Analysis of different genotyping and selection strategies in laying hen breeding programs. Journal: Genetics Selection Evolution.
Pub Date : 2025-04-01 DOI: 10.1186/s12711-025-00956-4
Christopher M. Pooley, Glenn Marion, Jamie Prentice, Ricardo Pong-Wong, Stephen C. Bishop, Andrea Doeschl-Wilson
Genetic selection of individuals that are less susceptible to infection, less infectious once infected, and recover faster, offers an effective and long-lasting solution to reduce the incidence and impact of infectious diseases in farmed animals. However, computational methods for simultaneously estimating genetic parameters for host susceptibility, infectivity and recoverability from real-world data have been lacking. Our previously developed methodology and software tool SIRE 1.0 (Susceptibility, Infectivity and Recoverability Estimator) allows estimation of host genetic effects of a single nucleotide polymorphism (SNP), or other fixed effects (e.g. breed, vaccination status), for these three host traits using individual disease data typically available from field studies and challenge experiments. SIRE 1.0, however, lacks the capability to estimate genetic parameters for these traits in the likely case of underlying polygenic control. This paper introduces novel Bayesian methodology and a new software tool SIRE 2.0 for estimating polygenic contributions (i.e. variance components and additive genetic effects) for host susceptibility, infectivity and recoverability from temporal epidemic data, assuming that pedigree or genomic relationships are known. Analytical expressions for prediction accuracies (PAs) for these traits are derived for simplified scenarios, revealing their dependence on genetic and phenotypic variances, and the distribution of related individuals within and between contact groups. PAs for infectivity are found to be critically dependent on the size of contact groups. Validation of the methodology with data from simulated epidemics demonstrates good agreement between numerically generated PAs and analytical predictions. Genetic correlations between infectivity and other traits substantially increase trait PAs. Incomplete data (e.g. time censored or infrequent sampling) generally yield only small reductions in PAs, except when infection times are completely unknown, which results in a substantial reduction. The method presented can estimate genetic parameters for host susceptibility, infectivity and recoverability from individual disease records. The freely available SIRE 2.0 software provides a valuable extension to SIRE 1.0 for estimating host polygenic effects underlying infectious disease transmission. This tool will open up new possibilities for analysis and quantification of genetic determinants of disease dynamics.
Title: SIRE 2.0: a novel method for estimating polygenic host effects underlying infectious disease transmission, and analytical expressions for prediction accuracies. Journal: Genetics Selection Evolution.
Pub Date : 2025-03-28 DOI: 10.1186/s12711-025-00962-6
Manu Kumar Gundappa, Diego Robledo, Alastair Hamilton, Ross D. Houston, James G. D. Prendergast, Daniel J. Macqueen
Whole genome sequencing (WGS), despite its advantages, is yet to replace methods for genotyping single nucleotide variants (SNVs) such as SNP arrays and targeted genotyping assays. Structural variants (SVs) can have larger effects on traits than SNVs, but are more challenging to accurately genotype. Using low-coverage WGS with genotype imputation offers a cost-effective strategy to achieve genome-wide variant coverage, but is yet to be tested for SVs. Here, we investigate combined SNV and SV imputation with low-coverage WGS data in Atlantic salmon (Salmo salar). As the reference panel, we used genotypes for high-confidence SVs and SNVs for n = 365 wild individuals sampled from diverse populations. We also generated 15× WGS data (n = 20 samples) for a commercial population external to the reference panel, and called SVs and SNVs with gold-standard approaches. An imputation method selected for its established performance using low-coverage sequencing data (GLIMPSE) was tested at WGS depths of 1×, 2×, 3×, and 4× for samples within and external to the reference panel. SNVs were imputed with high accuracy and recall across all WGS depths, including for samples outside the reference panel. For SVs, we compared imputation based purely on linkage disequilibrium (LD) with SNVs, to that supplemented with SV genotype likelihoods (GLs) from low-coverage WGS. Including SV GLs increased imputation accuracy, but as a trade-off with recall, requiring 3–4× depth for best performance. Combining strategies allowed us to capture 84% of the reference panel deletions with 87% accuracy at 1× depth. We also show that SV length affects imputation performance, with provision of SV GLs greatly enhancing accuracy for the longest SVs in the dataset. This study highlights the promise of reference panel imputation using low-coverage WGS, including novel opportunities to enhance the resolution of genome-wide association studies by capturing SVs.
Title: High performance imputation of structural and single nucleotide variants using low-coverage whole genome sequencing. Journal: Genetics Selection Evolution.
Pub Date : 2025-03-21 DOI: 10.1186/s12711-025-00964-4
Theo Meuwissen, Vinzent Boerner
The GWABLUP (Genome-Wide Association based Best Linear Unbiased Prediction) approach used GWA analysis results to differentially weigh the SNPs in genomic prediction, and was found to improve the reliabilities of genomic predictions. However, the proposed multitrait GWABLUP method assumed that the SNP weights were the same across the traits. Here we extended and validated the multitrait GWABLUP method towards using trait-specific SNP weights. In a 3-trait dairy data set, multitrait GWAS estimates of SNP effects and their standard errors were translated into trait-specific likelihood ratios for the SNPs having trait effects, and posterior probabilities using the GWABLUP approach. This produced trait-specific prior (co)variance matrices for each SNP, which were applied in a SNP-BLUP model for genomic predictions, implemented in the APEX linear model suite. In a validation population, the trait-specific SNP weights resulted in more reliable predictions for all three traits. In particular, for somatic cell count, which was hardly related to the other traits, the use of the same weights across all traits harmed genomic predictions. The use of trait-specific SNP weights overcame this problem. In multitrait GWABLUP analyses of ~30,000 reference population cows, trait-specific SNP weights resulted in up to 13% more reliable genomic predictions than unweighted SNP-BLUP, and improved genomic predictions for all three studied traits.
Title: Multitrait genome-wide association best linear unbiased prediction of genetic values. Journal: Genetics Selection Evolution.
Pub Date : 2025-03-21 DOI: 10.1186/s12711-025-00963-5
Didier Boichard, Sébastien Fritz, Pascal Croiseau, Vincent Ducrocq, Thierry Tribout, Beatriz C. D. Cuyabano
Most validation studies of genomic evaluations on candidates (prior to observing phenotypes) present inflation of their predicted breeding values, i.e., regression coefficients of their later observed phenotypes on the early predictions are smaller than one. The aim of this study was to show that this inflation pattern reflects, at least partly, long-distance associations between markers and quantitative trait loci (QTL) in the reference population and to propose methods to estimate the corresponding “erosion” coefficient. Across-chromosome linkage disequilibrium (LD) is observed in different dairy cattle breeds, resulting from limited effective population size and from relationships within the reference population. Due to this long-distance LD, the estimated SNP effects capture non-zero contributions from distant QTLs, some located on chromosomes other than that of the SNP itself. Therefore, corresponding SNP effects are partly lost in the next generations and we refer to this loss as “erosion”. With the concept of QTL contribution to SNP effects derived from mixed model equations, we show with simulation that this long-range LD explains 6–25% of the variance of the estimated genomic breeding values, a proportion that is unchanged when the evaluation model includes a residual polygenic effect. Two methods are proposed to predict this erosion factor assuming known simulated QTL effects. In Method 1, one generation of progeny is simulated from the reference population and the GEBV of these progeny based on SNP effects estimated in this newly simulated generation are regressed on the GEBV of the same progeny based on SNP effects estimated in the reference population. In Method 2, all the QTL contributions to SNP effects are regressed based on SNP-QTL recombination rates and summed to predict the GEBV at the next generation. The regression coefficient of the GEBV based on eroded contributions on the raw GEBV is also an estimate of erosion. 
An illustration is given with the French Normande female reference bovine population in 2021, showing erosion factors ranging from 0.84 to 0.87. Accounting for erosion is important to avoid inflation and biased predictions. The ways to both reduce inflation and to correct for it in the prediction are discussed.
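The inflation diagnostic described in this abstract is an ordinary least-squares slope: later observations are regressed on the early predictions, and a slope below one indicates inflated predictions. A minimal sketch follows, using made-up numbers rather than data from the study; the function and variable names are hypothetical and this is not the authors' Method 1 or Method 2.

```python
def regression_slope(y, x):
    """Ordinary least-squares slope of y regressed on x: cov(x, y) / var(x)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has zero variance")
    return sxy / sxx


# Made-up early predictions (GEBV) and later observations, for illustration only.
early_gebv = [-2.0, -1.0, 0.0, 1.0, 2.0]
later_observation = [0.85 * g for g in early_gebv]

slope = regression_slope(later_observation, early_gebv)
print(f"regression slope: {slope:.2f}")  # a value below 1 indicates inflation
```

In this toy case the later observations are exactly 0.85 times the predictions, so the slope recovers 0.85, the same order as the erosion factors of 0.84 to 0.87 reported for the Normande population.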
Title: Erosion of estimated genomic breeding values with generations is due to long distance associations between markers and QTL. Journal: Genetics Selection Evolution.
To address the increasing demand for high-quality pork protein, it is essential to implement strategies that enhance diets and produce pigs with excellent production traits. Selective breeding and crossbreeding are the primary methods used for genetic improvement in modern agriculture. However, these methods face challenges due to long breeding cycles and the necessity for beneficial genetic variation associated with high-quality traits within the population. This limitation restricts the transfer of desirable alleles across different genera and species. This article systematically reviews past and current research advancements in porcine molecular breeding. It discusses the screening of clustered regularly interspaced short palindromic repeats (CRISPR) to identify resistance loci in swine and the challenges and future applications of genetically modified pigs. The emergence of transgenic and gene editing technologies has prompted researchers to apply these methods to pig breeding. These advancements allow for alterations in the pig genome through various techniques, ranging from random integration into the genome to site-specific insertion and from target gene knockout (KO) to precise base and prime editing. As a result, numerous desirable traits, such as disease resistance, high meat yield, improved feed efficiency, reduced fat deposition, and lower environmental waste, can be achieved easily and effectively by genetic modification. These traits can serve as valuable resources to enhance swine breeding programmes. In the era of genome editing, molecular breeding of pigs is critical to the future of agriculture. Long-term and multidomain analyses of genetically modified pigs by researchers, related policy development by regulatory agencies, and public awareness and acceptance of their safety are the keys to realizing the transition of genetically modified products from the laboratory to the market.
{"title":"Molecular breeding of pigs in the genome editing era","authors":"Jiahuan Chen, Jiaqi Wang, Haoran Zhao, Xiao Tan, Shihan Yan, Huanyu Zhang, Tiefeng Wang, Xiaochun Tang","doi":"10.1186/s12711-025-00961-7","DOIUrl":"https://doi.org/10.1186/s12711-025-00961-7","journal":{"name":"Genetics Selection Evolution"},"publicationDate":"2025-03-10","publicationTypes":"Journal Article"}
Pub Date: 2025-03-10 DOI: 10.1186/s12711-025-00957-3
Ruilin Su, Jingbo Lv, Yahui Xue, Sheng Jiang, Lei Zhou, Li Jiang, Junyan Tan, Zhencai Shen, Ping Zhong, Jianfeng Liu
The effectiveness of genomic prediction (GP) significantly influences breeding progress, and employing SNP markers to predict phenotypic values is a pivotal aspect of pig breeding. Machine learning (ML) methods are often used to predict phenotypic values because of their advantages in processing high-dimensional data. However, existing research has not indicated which ML methods are suitable for most pig genomic prediction tasks. Therefore, an appropriate method must be selected from a large number of ML methods whenever genomic prediction is performed. This paper compared the performance of popular ML methods in predicting pig phenotypes and identified methods suitable for most traits. Five commonly used datasets from the literature were used to compare the performance of the different ML methods. The experimental results demonstrate that Stacking performs best on the PIC dataset, where the trait information is hidden, closely followed by kernel ridge regression with an RBF kernel (KRR-RBF). Support vector regression (SVR) performs best in predicting reproductive traits, followed by genomic best linear unbiased prediction (GBLUP). GBLUP achieves the best performance on growth traits, with SVR as the second best. GBLUP achieves good performance for GP problems. Similarly, the Stacking, SVR, and KRR-RBF methods also achieve high prediction accuracy. Moreover, LR statistical analysis shows that Stacking, SVR and KRR are stable. When applying ML methods to predict phenotypic values in pigs, we recommend these three approaches.
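For readers unfamiliar with KRR-RBF, the following is a minimal sketch of kernel ridge regression with an RBF kernel applied to SNP genotypes. It is not the authors' implementation: the simulated genotypes, the hyperparameters `gamma` and `lam`, and all function names are assumptions chosen for illustration, and the plain-Python solver stands in for the numerical library a real analysis would use.

```python
import math
import random


def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) similarity between two genotype vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def krr_fit(genotypes, phenotypes, gamma, lam):
    """Return dual weights alpha that solve (K + lam * I) alpha = y."""
    n = len(genotypes)
    k = [[rbf_kernel(genotypes[i], genotypes[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(k, phenotypes)


def krr_predict(train_genotypes, alpha, new_genotype, gamma):
    """Predict as a kernel-weighted sum over the training animals."""
    return sum(a * rbf_kernel(x, new_genotype, gamma)
               for a, x in zip(alpha, train_genotypes))


def pearson(u, v):
    """Pearson correlation between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# Hypothetical toy data: 40 animals, 15 SNPs coded 0/1/2, additive effects plus noise.
rng = random.Random(0)
n_animals, n_snps, n_train = 40, 15, 30
effects = [rng.gauss(0, 1) for _ in range(n_snps)]
genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_snps)] for _ in range(n_animals)]
phenotypes = [sum(g * e for g, e in zip(row, effects)) + rng.gauss(0, 1)
              for row in genotypes]

# Centre the training phenotypes, fit, then predict the held-out animals.
mean_y = sum(phenotypes[:n_train]) / n_train
train_x = genotypes[:n_train]
train_y = [y - mean_y for y in phenotypes[:n_train]]
alpha = krr_fit(train_x, train_y, gamma=0.01, lam=1.0)
preds = [mean_y + krr_predict(train_x, alpha, x, gamma=0.01)
         for x in genotypes[n_train:]]
print("predictive correlation:", round(pearson(preds, phenotypes[n_train:]), 3))
```

Predictive ability is summarised here as the correlation between predicted and observed phenotypes in held-out animals, which is one common choice; the abstract does not state the metric or validation design used in the study.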
{"title":"Genomic selection in pig breeding: comparative analysis of machine learning algorithms","authors":"Ruilin Su, Jingbo Lv, Yahui Xue, Sheng Jiang, Lei Zhou, Li Jiang, Junyan Tan, Zhencai Shen, Ping Zhong, Jianfeng Liu","doi":"10.1186/s12711-025-00957-3","DOIUrl":"https://doi.org/10.1186/s12711-025-00957-3","journal":{"name":"Genetics Selection Evolution"},"publicationDate":"2025-03-10","publicationTypes":"Journal Article"}
Pub Date: 2025-03-06 DOI: 10.1186/s12711-025-00960-8
Samuele Bovo, Anisa Ribani, Flaminia Fanelli, Giuliano Galimberti, Pier Luigi Martelli, Paolo Trevisi, Francesca Bertolini, Matteo Bolner, Rita Casadio, Stefania Dall’Olio, Maurizio Gallo, Diana Luise, Gianluca Mazzoni, Giuseppina Schiavo, Valeria Taurisano, Paolo Zambonelli, Paolo Bosi, Uberto Pagotto, Luca Fontanesi
Metabolomics opens novel avenues to study the basic biological mechanisms underlying complex traits, starting from characterization of metabolites. Metabolites and their levels in a biofluid represent simple molecular phenotypes (metabotypes) that are direct products of enzyme activities and relate to all metabolic pathways, including catabolism and anabolism of nutrients. In this study, we demonstrated the utility of merging metabolomics and genomics in pigs to uncover a large list of genetic factors that influence mammalian metabolism. We obtained targeted characterization of the plasma metabolome of more than 1300 pigs from two populations of Large White and Duroc pig breeds. The metabolomic profiles of these pigs were used to identify genetically influenced metabolites by estimating the heritability of the level of 188 metabolites. Then, combining breed-specific genome-wide association studies of single metabolites and their ratios with across-breed meta-analyses, we identified a total of 97 metabolite quantitative trait loci (mQTL), associated with 126 metabolites. Using these results, we constructed a human-pig comparative catalog of genetic factors influencing the metabolomic profile. Whole genome resequencing data identified several putative causative mutations for these mQTL. Additionally, based on a major mQTL for kynurenine level, we designed a nutrigenetic study in which piglets carrying different genotypes at the candidate gene kynurenine 3-monooxygenase (KMO) were fed varying levels of tryptophan, and demonstrated the effect of this genetic factor on the kynurenine pathway. Furthermore, we used metabolomic profiles of Large White and Duroc pigs to reconstruct metabolic pathways using Gaussian Graphical Models, which included perturbation of the identified mQTL. 
This study has provided the first catalog of genetic factors affecting molecular phenotypes that describe the pig blood metabolome, with links to important metabolic pathways, opening novel avenues to merge genetics and nutrition in this livestock species. The obtained results are relevant for basic and applied biology and to evaluate the pig as a biomedical model. Genetically influenced metabolites can be further exploited in nutrigenetic approaches in pigs. The described molecular phenotypes can be useful to dissect complex traits and design novel feeding, breeding and selection programs in pigs.
{"title":"Merging metabolomics and genomics provides a catalog of genetic factors that influence molecular phenotypes in pigs linking relevant metabolic pathways","authors":"Samuele Bovo, Anisa Ribani, Flaminia Fanelli, Giuliano Galimberti, Pier Luigi Martelli, Paolo Trevisi, Francesca Bertolini, Matteo Bolner, Rita Casadio, Stefania Dall’Olio, Maurizio Gallo, Diana Luise, Gianluca Mazzoni, Giuseppina Schiavo, Valeria Taurisano, Paolo Zambonelli, Paolo Bosi, Uberto Pagotto, Luca Fontanesi","doi":"10.1186/s12711-025-00960-8","DOIUrl":"https://doi.org/10.1186/s12711-025-00960-8","journal":{"name":"Genetics Selection Evolution"},"publicationDate":"2025-03-06","publicationTypes":"Journal Article"}
Pub Date: 2025-03-04 DOI: 10.1186/s12711-025-00955-5
Can Yuan, Alain Gillon, José Luis Gualdrón Duarte, Haruko Takeda, Wouter Coppieters, Michel Georges, Tom Druet
The availability of large cohorts of whole-genome sequenced individuals, combined with functional annotation, is expected to provide opportunities to improve the accuracy of genomic selection (GS). However, such benefits have not often been observed in initial applications. The reference population for GS in Belgian Blue Cattle (BBC) continues to grow. Combined with the availability of reference panels of sequenced individuals, it provides an opportunity to evaluate GS models using whole genome sequence (WGS) data and functional annotation. Here, we used data from 16,508 cows, with phenotypes for five muscular development traits and imputed at the WGS level, in combination with in silico functional annotation and catalogs of putative regulatory variants obtained from experimental data. We first evaluated GS models using the entire WGS data, with or without functional annotation. At this marker density, we were able to run two approaches, assuming either a highly polygenic architecture (GBLUP) or allowing some variants to have larger effects (BayesRR-RC, a Bayesian mixture model), and observed increased reliability compared to the official GBLUP model at medium marker density (by 0.016 and 0.018 on average for GBLUP and BayesRR-RC, respectively). When functional annotation was used, we observed slightly higher reliabilities with an extension of GBLUP that included multiple polygenic terms (one per functional group), while reliabilities decreased with BayesRR-RC. We then used large subsets of variants selected based on functional information or with a linkage disequilibrium (LD) pruning approach, which allowed us to evaluate two additional approaches, BayesCπ and Bayesian Sparse Linear Mixed Model (BSLMM). Reliabilities were higher for these panels than for the WGS data, with the highest accuracies obtained when markers were selected based on functional information. In our setting, BSLMM systematically achieved higher reliabilities than other methods. 
GS with large panels of functional variants selected from WGS data allowed a significant increase in reliability compared to the official genomic evaluation approach. However, the benefits of using WGS and functional data remained modest, indicating that there is still room for improvement, for example by further refining the functional annotation in the BBC breed.
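GBLUP models of the kind compared above rest on a genomic relationship matrix built from marker genotypes. The abstract does not say how that matrix was constructed in this study; the sketch below assumes the widely used first method of VanRaden, G = ZZ' / (2 Σ p(1 − p)), with allele frequencies estimated from the sample itself. The function name and the four-animal genotype table are hypothetical.

```python
def vanraden_grm(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts (animals x SNPs)."""
    n_animals, n_snps = len(genotypes), len(genotypes[0])
    # Frequency of the counted allele at each SNP, estimated from the sample.
    freqs = [sum(row[j] for row in genotypes) / (2 * n_animals)
             for j in range(n_snps)]
    # Centre each genotype by its expectation 2p.
    z = [[row[j] - 2 * freqs[j] for j in range(n_snps)] for row in genotypes]
    # Scale so that relationships are comparable to pedigree relationships.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale
             for k in range(n_animals)] for i in range(n_animals)]


# Hypothetical genotypes: 4 animals x 4 SNPs.
genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
    [1, 2, 1, 0],
]
grm = vanraden_grm(genotypes)
for row in grm:
    print([round(value, 3) for value in row])
```

Because the allele frequencies come from the same animals, every row of the resulting matrix sums to zero; using base-population frequencies instead would change that property. A multi-component extension such as the one mentioned in the abstract would build one such matrix per functional group of variants.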
{"title":"Evaluation of genomic selection models using whole genome sequence data and functional annotation in Belgian Blue cattle","authors":"Can Yuan, Alain Gillon, José Luis Gualdrón Duarte, Haruko Takeda, Wouter Coppieters, Michel Georges, Tom Druet","doi":"10.1186/s12711-025-00955-5","DOIUrl":"https://doi.org/10.1186/s12711-025-00955-5","journal":{"name":"Genetics Selection Evolution"},"publicationDate":"2025-03-04","publicationTypes":"Journal Article"}