Pub Date: 2023-11-09 DOI: 10.1186/s12711-023-00853-8
Felix Heinrich, Thomas Martin Lange, Magdalena Kircher, Faisal Ramzan, Armin Otto Schmitt, Mehmet Gültas
The ever-increasing availability of high-density genomic markers in the form of single nucleotide polymorphisms (SNPs) enables genomic prediction, i.e. the inference of phenotypes based solely on genomic data, which has become an important tool in animal and plant breeding. However, given the limited number of individuals, the abundance of variables (SNPs) can reduce the accuracy of prediction models due to overfitting or irrelevant SNPs. Feature selection can help to reduce the number of irrelevant SNPs and increase model performance. In this study, we investigated an incremental feature selection approach that ranks the SNPs according to the results of a genome-wide association study and uses random forest as the prediction model, and we applied it to several animal and plant datasets. Applying our approach to different datasets yielded a wide range of outcomes, from a substantial increase in prediction accuracy in a few cases to minor improvements when only a fraction of the available SNPs was used. Compared with models using all available SNPs, our approach was able to achieve comparable performance with a considerably reduced number of SNPs in several cases. Our approach showcased state-of-the-art efficiency and performance while having a faster computation time. The results of our study suggest that our incremental feature selection approach has the potential to improve prediction accuracy substantially. However, this gain seems to depend on the genomic data used. Even for datasets where the number of markers is smaller than the number of individuals, feature selection may still increase the performance of genomic prediction. Our approach is implemented in R and is available at https://github.com/FelixHeinrich/GP_with_IFS/.
Title: Exploring the potential of incremental feature selection to improve genomic prediction accuracy. Journal: Genetics Selection Evolution, 2023.
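The incremental feature selection idea in the abstract above (rank SNPs by GWAS result, then grow the SNP set step by step and keep the best-performing subset) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' R implementation: the function names, the toy data, and the `step` parameter are hypothetical, the p-values are assumed to come from a GWAS run beforehand, and the prediction model is a deliberately simple stand-in (an average of single-SNP regressions) rather than a random forest.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation, used here as the prediction accuracy measure."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def fit_predict(train_X, train_y, test_X, snps):
    """Stand-in model (NOT a random forest): average of single-SNP regressions."""
    my = mean(train_y)
    preds = [0.0] * len(test_X)
    for j in snps:
        col = [row[j] for row in train_X]
        mx = mean(col)
        var = sum((x - mx) ** 2 for x in col)
        cov = sum((x - mx) * (y - my) for x, y in zip(col, train_y))
        slope = cov / var if var > 0 else 0.0
        for i, row in enumerate(test_X):
            preds[i] += (my + slope * (row[j] - mx)) / len(snps)
    return preds

def incremental_feature_selection(train_X, train_y, val_X, val_y, p_values, step=1):
    """Add SNPs in order of increasing p-value; keep the best-scoring prefix."""
    ranked = sorted(range(len(p_values)), key=lambda j: p_values[j])
    best_k, best_acc = 0, float("-inf")
    for k in range(step, len(ranked) + 1, step):
        acc = pearson(fit_predict(train_X, train_y, val_X, ranked[:k]), val_y)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return ranked[:best_k], best_acc

# Toy genotypes (0/1/2 allele counts); the phenotype depends only on SNP 0.
train_X = [[0, 1, 2], [1, 0, 1], [2, 2, 0], [0, 2, 1], [1, 1, 2], [2, 0, 0]]
train_y = [2 * row[0] + 1 for row in train_X]
val_X = [[0, 2, 0], [1, 1, 1], [2, 0, 2], [1, 2, 0]]
val_y = [2 * row[0] + 1 for row in val_X]
p_values = [1e-8, 0.5, 0.03]  # hypothetical GWAS p-values, one per SNP

selected, acc = incremental_feature_selection(train_X, train_y, val_X, val_y, p_values)
print(selected, round(acc, 3))
```

In practice the ranking would be computed on the training data only, and accuracy would be estimated by cross-validation, so that the selected SNP count is not tuned on the data used to report performance.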
Pub Date: 2023-11-07 DOI: 10.1186/s12711-023-00851-w
Guillaume Lenoir, Loïc Flatres-Grall, Rafael Muñoz-Tamayo, Ingrid David, Nicolas C Friggens
Background: There is a growing need to improve robustness of fattening pigs, but this trait is difficult to phenotype. Our first objective was to develop a proxy for robustness of fattening pigs by modelling the longitudinal energy allocation coefficient to growth, with the resulting environmental variance of this allocation coefficient considered as a proxy for robustness. The second objective was to estimate its genetic parameters and correlations with traits under selection and with phenotypes that are routinely collected. In total, 5848 pigs from a Pietrain NN paternal line were tested at the AXIOM boar testing station (Azay-sur-Indre, France) from 2015 to 2022. This farm is equipped with an automatic feeding system that records individual weight and feed intake at each visit. We used a dynamic linear regression model to characterize the evolution of the allocation coefficient between the available cumulative net energy, which was estimated from feed intake, and cumulative weight gain during the fattening period. Longitudinal energy allocation coefficients were analysed using a two-step approach to estimate both the genetic variance of the coefficients and the genetic variance in their residual variance, which will be referred to as the log-transformed squared residual (LSR).
Results: The LSR trait, which could be interpreted as an indicator of the response of the animal to perturbations/stress, showed a low heritability (0.05 ± 0.01), a high favourable genetic correlation with average daily growth (-0.71 ± 0.06), and unfavourable genetic correlations with feed conversion ratio (-0.76 ± 0.06) and residual feed intake (-0.83 ± 0.06). Segmentation of the population into four classes using estimated breeding values for LSR showed that animals with the lowest estimated breeding values were those with the worst values for phenotypic proxies of robustness, which were assessed using records routinely collected on farm.
Conclusions: Results of this study show that selection for robustness, based on estimated breeding values for environmental variance of the allocation coefficients to growth, can be considered in breeding programs for fattening pigs.
Title: Disentangling the dynamics of energy allocation to develop a proxy for robustness of fattening pigs. Journal: Genetics Selection Evolution 55(1):77. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10629156/pdf/
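The log-transformed squared residual (LSR) described above can be illustrated for a single animal. The sketch below is a strong simplification and is only meant to show the arithmetic: it fits an ordinary straight line to one animal's allocation coefficients over time instead of the genetic models used in the study, and the function names, the `eps` guard, and the toy trajectories are all hypothetical.

```python
import math
from statistics import mean

def linear_fit(x, y):
    """Ordinary least-squares line; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def log_squared_residuals(days, alpha, eps=1e-12):
    """Step 1: remove the fitted trend. Step 2: LSR = ln(residual^2).

    eps only guards against log(0) for a residual that is exactly zero.
    """
    intercept, slope = linear_fit(days, alpha)
    residuals = [a - (intercept + slope * d) for d, a in zip(days, alpha)]
    return [math.log(r * r + eps) for r in residuals]

days = list(range(10))
# Two made-up trajectories with the same trend but different day-to-day noise.
steady = [0.30 + 0.001 * d + 0.001 * (-1) ** d for d in days]
perturbed = [0.30 + 0.001 * d + 0.020 * (-1) ** d for d in days]

lsr_steady = mean(log_squared_residuals(days, steady))
lsr_perturbed = mean(log_squared_residuals(days, perturbed))
print(round(lsr_steady, 2), round(lsr_perturbed, 2))
```

A lower mean LSR corresponds to a trajectory that stays closer to its own trend, which is the sense in which the abstract treats the residual variance as a proxy for robustness.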
Pub Date: 2023-11-02 DOI: 10.1186/s12711-023-00850-x
Amanda B Alvarenga, Kelli J Retallick, Andre Garcia, Stephen P Miller, Andrew Byrne, Hinayah R Oliveira, Luiz F Brito
Background: Hoof structure and health are essential for the welfare and productivity of beef cattle. Therefore, we assessed the genetic and genomic background of foot score traits in American (US) and Australian (AU) Angus cattle and investigated the feasibility of performing genomic evaluations combining data for foot score traits recorded in US and AU Angus cattle. The traits evaluated were foot angle (FA) and claw set (CS). In total, 109,294 and ~1.12 million animals had phenotypic and genomic information, respectively. Four sets of analyses were performed: (1) genomic connectedness between US and AU Angus cattle populations and population structure, (2) estimation of genetic parameters, (3) single-step genomic prediction of breeding values, and (4) single-step genome-wide association studies for FA and CS.
Results: There was no clear genetic differentiation between US and AU Angus populations. Similar heritability estimates (FA: 0.22-0.24 and CS: 0.22-0.27) and moderate-to-high genetic correlations between US and AU foot scores (FA: 0.61 and CS: 0.76) were obtained. A joint-genomic prediction using data from both populations outperformed within-country genomic evaluations. A genomic prediction model considering US and AU datasets as a single population performed similarly to the scenario accounting for genotype-by-environment interactions (i.e., multiple-trait model considering US and AU records as different traits), even though the genetic correlations between countries were lower than 0.80. Common significant genomic regions were observed between US and AU for FA and CS. Significant single nucleotide polymorphisms were identified on the Bos taurus (BTA) chromosomes BTA1, BTA5, BTA11, BTA13, BTA19, BTA20, and BTA23. The candidate genes identified were primarily from growth factor gene families, including FGF12 and GDF5, which were previously associated with bone structure and repair.
Conclusions: This study presents comprehensive population structure and genetic and genomic analyses of foot scores in US and AU Angus cattle populations, which are essential for optimizing the implementation of genomic selection for improved foot scores in Angus cattle breeding programs. We have also identified candidate genes associated with foot scores in the largest Angus cattle populations in the world and made recommendations for genomic evaluations for improved foot score traits in the US and AU.
Title: Across-country genetic and genomic analyses of foot score traits in American and Australian Angus cattle. Journal: Genetics Selection Evolution 55(1):76. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10621155/pdf/
Pub Date: 2023-10-25 DOI: 10.1186/s12711-023-00846-7
Katherine D Arias, Juan Pablo Gutiérrez, Iván Fernández, Isabel Álvarez, Félix Goyache
Background: In spite of the availability of single nucleotide polymorphism (SNP) array data, differentiation between observed homozygosity and that caused by mating between relatives (autozygosity) introduces major difficulties. Homozygosity estimators show large variation due to different causes, namely, Mendelian sampling, population structure, and differences among chromosomes. Therefore, the ascertainment of how inbreeding is reflected in the genome is still an issue. The aim of this research was to study the usefulness of genomic information for the assessment of genetic diversity in the highly endangered Gochu Asturcelta pig breed. Pedigree depth varied from 0 (founders) to 4 equivalent discrete generations (t). Four homozygosity parameters (runs of homozygosity, FROH; heterozygosity-rich regions, FHRR; Li and Horvitz's, FLH; and Yang and colleagues', FYAN) were computed for each individual, adjusted for the variability in the base population (BP; six individuals) and further jackknifed over autosomes. Individual increases in homozygosity (depending on t) and increases in pairwise homozygosity (i.e., increase in the parents' mean) were computed for each individual in the pedigree, and effective population size (Ne) was computed for five subpopulations (cohorts). Genealogical parameters (individual inbreeding, individual increase in inbreeding, and Ne) were used for comparisons.
Results: The mean F was 0.120 ± 0.074 and the mean BP-adjusted homozygosity ranged from 0.099 ± 0.081 (FLH) to 0.152 ± 0.075 (FYAN). After jackknifing, the mean values were slightly lower. The increase in pairwise homozygosity tended to be twofold higher than the corresponding individual increase in homozygosity values. When compared with genealogical estimates, estimates of Ne obtained using FYAN tended to have low root-mean-squared errors. However, Ne estimates based on increases in pairwise homozygosity using both FROH and FHRR estimates of genomic inbreeding had lower root-mean-squared errors.
Conclusions: Parameters characterizing homozygosity may not accurately depict losses of variability in small populations in which breeding policy prohibits matings between close relatives. After BP adjustment, the performance of FROH and FHRR was highly consistent. Assuming that an increase in homozygosity depends only on pedigree depth can lead to underestimating it in populations with shallow pedigrees. An increase in pairwise homozygosity computed from either FROH or FHRR is a promising approach for characterizing autozygosity.
Title: Approaching autozygosity in a small pedigree of Gochu Asturcelta pigs. Journal: Genetics Selection Evolution 55(1):74. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10601182/pdf/
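To make the runs-of-homozygosity parameter and the jackknife over autosomes more concrete, here is a minimal sketch. It rests on several assumptions that are not taken from the study: a run is defined simply as at least `min_snps` consecutive homozygous SNPs (real ROH callers also use minimum lengths, density limits and allowed heterozygotes), the jackknife is read as leaving out one chromosome at a time and averaging the resulting estimates, and all names and toy genotypes are hypothetical.

```python
def roh_length(genotypes, positions, min_snps=3):
    """Base pairs covered by runs of >= min_snps consecutive homozygous SNPs.

    genotypes are allele counts 0/1/2, where 1 means heterozygous.
    """
    total, start, last, count = 0, None, None, 0
    for g, pos in zip(genotypes, positions):
        if g != 1:                      # homozygous: extend the current run
            if start is None:
                start = pos
            last = pos
            count += 1
        else:                           # heterozygous: close the current run
            if count >= min_snps:
                total += last - start
            start, last, count = None, None, 0
    if count >= min_snps:               # run that reaches the chromosome end
        total += last - start
    return total

def f_roh(chromosomes, min_snps=3):
    """F_ROH = total ROH length / total chromosome length."""
    covered = sum(roh_length(c["genotypes"], c["positions"], min_snps)
                  for c in chromosomes.values())
    return covered / sum(c["length"] for c in chromosomes.values())

def jackknife_over_chromosomes(chromosomes, min_snps=3):
    """Mean of leave-one-chromosome-out F_ROH estimates."""
    estimates = [
        f_roh({k: v for k, v in chromosomes.items() if k != left_out}, min_snps)
        for left_out in chromosomes
    ]
    return sum(estimates) / len(estimates)

# Toy individual with two short "chromosomes" (positions in bp).
animal = {
    "chr1": {"length": 100, "positions": [10, 20, 30, 40, 50, 60],
             "genotypes": [0, 0, 2, 1, 0, 2]},
    "chr2": {"length": 100, "positions": [10, 30, 50, 70, 90],
             "genotypes": [2, 2, 2, 2, 2]},
}
print(f_roh(animal), jackknife_over_chromosomes(animal))
```

On chr1 only the first three SNPs form a qualifying run (20 bp), while chr2 is homozygous throughout (80 bp), so the two chromosomes contribute very differently. That kind of between-chromosome variation is what a leave-one-out scheme exposes.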
Pub Date: 2023-10-25 DOI: 10.1186/s12711-023-00839-6
Laure-Hélène Maugan, Roberta Rostellato, Thierry Tribout, Sophie Mattalia, Vincent Ducrocq
Background: For years, multiple trait genetic evaluations have been used to increase the accuracy of estimated breeding values (EBV) using information from correlated traits. In France, accurate approximations of multiple trait evaluations were implemented for traits that are described by different models by combining the results of univariate best linear unbiased prediction (BLUP) evaluations. Functional longevity (FL) is the trait that has most benefited from this approach. Currently, with many single-step (SS) evaluations, only univariate FL evaluations can be run. The aim of this study was to implement a "combined" SS (CSS) evaluation that extends the "combined" BLUP evaluation to obtain more accurate genomic (G) EBV for FL when information from five correlated traits (somatic cell score, clinical mastitis, conception rate for heifers and cows, and udder depth) is added.
Results: GEBV obtained from univariate SS (USS) evaluations and from a CSS evaluation were compared. The correlations between these GEBV showed the benefits of including information from correlated traits. Indeed, a CSS evaluation run without any FL performance records showed that the indirect information from correlated traits to evaluate FL was substantial. USS and CSS evaluations that mimic SS evaluations with data available in 2016 were compared. For each evaluation separately, the GEBV were sorted and then split into 10 consecutive groups (deciles). Survival curves were calculated for each group, based on the observed productive life of these cows as known in 2021. Regardless of their genotyping status, the worst group of heifers based on their GEBV in 2016 was well identified in the CSS evaluation and had a substantially shorter herd life, while those in the best heifer group had a longer herd life. The gaps between groups were larger for the genotyped than for the ungenotyped heifers, which indicates better prediction of future survival.
Conclusions: A CSS evaluation is an efficient tool to improve FL. It allows a proper combination of information on functional traits that influence culling. In contrast, because of the strong selection intensity on young bulls for functional traits, the benefit of such a "combined" evaluation of functional traits is more modest for these males.
Title: Combined single-step evaluation of functional longevity of dairy cows including correlated traits. Journal: Genetics Selection Evolution 55(1):75. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10601146/pdf/
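The validation step described above (sort GEBV, split into deciles, compare survival between groups) can be sketched as follows. This is only an illustration under assumptions: the function names and the toy numbers are hypothetical, a higher GEBV is assumed to be better, and censoring is ignored, whereas a real analysis would use a proper survival estimator.

```python
def split_into_deciles(gebv_by_animal, n_groups=10):
    """Sort animals by GEBV (ascending) and cut them into consecutive groups."""
    ranked = sorted(gebv_by_animal, key=gebv_by_animal.get)
    n = len(ranked)
    return [ranked[i * n // n_groups:(i + 1) * n // n_groups]
            for i in range(n_groups)]

def survival_curve(herd_life, group, time_points):
    """Fraction of the group still in the herd after each time point.

    Simplification: every animal's herd life is assumed fully observed.
    """
    return [sum(herd_life[a] > t for a in group) / len(group)
            for t in time_points]

# Made-up example: 20 heifers whose herd life (months) increases with GEBV.
gebv = {animal: float(animal) for animal in range(20)}
herd_life = {animal: 10 + 2 * animal for animal in range(20)}

deciles = split_into_deciles(gebv)
worst, best = deciles[0], deciles[-1]
months = [12, 30, 48]
print(survival_curve(herd_life, worst, months))
print(survival_curve(herd_life, best, months))
```

The wider the gap between the curves of the worst and best groups, the better the earlier GEBV separated animals by their later survival, which is how the abstract compares the two evaluations.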
Pub Date: 2023-10-23 DOI: 10.1186/s12711-023-00849-4
Zhang Wang, Weihua Tian, Dandan Wang, Yulong Guo, Zhimin Cheng, Yanyan Zhang, Xinyan Li, Yihao Zhi, Donghua Li, Zhuanjian Li, Ruirui Jiang, Guoxi Li, Yadong Tian, Xiangtao Kang, Hong Li, Ian C Dunn, Xiaojun Liu
Background: Modern breeding strategies have resulted in significant differences in muscle mass between indigenous chicken and specialized broiler. However, the molecular regulatory mechanisms that underlie these differences remain elusive. The aim of this study was to identify key genes and regulatory mechanisms underlying differences in breast muscle development between indigenous chicken and specialized broiler.
Results: Two time-series RNA-sequencing profiles of breast muscles were generated from commercial Arbor Acres (AA) broiler (fast-growing) and Chinese indigenous Lushi blue-shelled-egg (LS) chicken (slow-growing) at embryonic days 10, 14, and 18, and post-hatching day 1 and weeks 1, 3, and 5. Principal component analysis of the transcriptome profiles showed that the top four principal components accounted for more than 80% of the total variance in each breed. The developmental axes between the AA and LS chicken overlapped at the embryonic stages but gradually separated at the adult stages. Integrative investigation of differentially-expressed transcripts contained in the top four principal components identified 44 genes that formed a molecular network associated with differences in breast muscle mass between the two breeds. In addition, alternative splicing analysis revealed that genes with multiple isoforms always had one dominant transcript that exhibited a significantly higher expression level than the others. Among the 44 genes, the TNFRSF6B gene, a mediator of signal transduction pathways and cell proliferation, harbored two alternative splicing isoforms, TNFRSF6B-X1 and TNFRSF6B-X2. TNFRSF6B-X1 was the dominant isoform in both breeds before the age of one week. A switching event of the dominant isoform occurred at one week of age, resulting in TNFRSF6B-X2 being the dominant isoform in AA broiler, whereas TNFRSF6B-X1 remained the dominant isoform in LS chicken. Gain-of-function assays demonstrated that both isoforms promoted the proliferation of chicken primary myoblasts, but only TNFRSF6B-X2 augmented the differentiation and intracellular protein content of chicken primary myoblasts.
Conclusions: For the first time, we identified several key genes and dominant isoforms that may be responsible for differences in muscle mass between slow-growing indigenous chicken and fast-growing commercial broiler. These findings provide new insights into the regulatory mechanisms underlying breast muscle development in chicken.
Title: Comparative analyses of dynamic transcriptome profiles highlight key response genes and dominant isoforms for muscle development and growth in chicken. Journal: Genetics Selection Evolution 55(1):73. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10591418/pdf/
Pub Date: 2023-10-18 DOI: 10.1186/s12711-023-00843-w
Di Zhu, Yiqiang Zhao, Ran Zhang, Hanyu Wu, Gengyuan Cai, Zhenfang Wu, Yuzhe Wang, Xiaoxiang Hu
Background: Although the accumulation of whole-genome sequencing (WGS) data has accelerated the identification of mutations underlying complex traits, its impact on the accuracy of genomic predictions is limited. Reliable genotyping data and pre-selected beneficial loci can be used to improve prediction accuracy. Previously, we reported a low-coverage sequencing genotyping method that yielded 11.3 million highly accurate single-nucleotide polymorphisms (SNPs) in pigs. Here, we introduce a method termed selective linkage disequilibrium pruning (SLDP), which refines the set of SNPs that show a large gain during prediction of complex traits using whole-genome SNP data.
Results: We used the SLDP method to identify and select markers among millions of SNPs based on genome-wide association study (GWAS) prior information. We evaluated the performance of SLDP with respect to three real traits and six simulated traits with varying genetic architectures using two representative models (genomic best linear unbiased prediction and BayesR) on samples from 3579 Duroc boars. SLDP was tuned by testing 180 combinations of two core parameters (GWAS P-value threshold and linkage disequilibrium r²). The parameters for each trait were optimized in the training population by five-fold cross-validation and then tested in the validation population. Similar to previous GWAS prior-based methods, the performance of SLDP was mainly affected by the genetic architecture of the traits analyzed. Specifically, SLDP performed better for traits controlled by major quantitative trait loci (QTL) or a small number of quantitative trait nucleotides (QTN). Compared with two commercial SNP chips, genotyping-by-sequencing data, and an unselected whole-genome SNP panel, the SLDP strategy led to significant improvements in prediction accuracy, which ranged from 0.84 to 3.22% for real traits controlled by major or moderate QTL and from 1.23 to 11.47% for simulated traits controlled by a small number of QTN.
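The abstract names the two SLDP parameters (a GWAS P-value threshold and an LD r² threshold) but does not spell out the algorithm. The sketch below shows one plausible way such a selection could work, as a greedy, P-value-ordered LD pruning. It is an illustrative assumption, not the authors' implementation; the function name `sldp_like_prune` and the 0/1/2 genotype coding are hypothetical.

```python
import numpy as np

def sldp_like_prune(genotypes, p_values, p_threshold, r2_threshold):
    """Greedy P-value-ordered LD pruning (illustrative assumption only).

    genotypes: (n_animals, n_snps) array of allele counts coded 0/1/2.
    p_values:  (n_snps,) array of GWAS P-values.
    Returns the sorted column indices of the retained SNPs.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    p_values = np.asarray(p_values, dtype=float)

    # Candidates must pass the P-value threshold and be polymorphic,
    # because the correlation of a constant column is undefined.
    polymorphic = genotypes.std(axis=0) > 0
    candidates = np.flatnonzero((p_values <= p_threshold) & polymorphic)

    # Visit candidates from most to least significant.
    order = candidates[np.argsort(p_values[candidates], kind="stable")]

    kept = []
    for snp in order:
        redundant = False
        for other in kept:
            r = np.corrcoef(genotypes[:, snp], genotypes[:, other])[0, 1]
            if r * r >= r2_threshold:
                # In high LD with a more significant SNP already kept.
                redundant = True
                break
        if not redundant:
            kept.append(int(snp))
    return sorted(kept)
```

In a tuning loop, each (P-value threshold, r² threshold) pair would yield one marker set whose prediction accuracy is then scored by cross-validation within the training population. The pairwise loop is quadratic in the number of retained SNPs, so a real whole-genome application would restrict comparisons to a window along the chromosome.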
Conclusions: The SLDP marker selection method can be incorporated into mainstream prediction models to yield accuracy improvements for traits with a relatively simple genetic architecture; however, it has no significant advantage for traits not controlled by major QTL. The main factors that affect its performance are the genetic architecture of traits and the reliability of GWAS prior information. Our findings can facilitate the application of WGS-based genomic selection.
{"title":"Genomic prediction based on selective linkage disequilibrium pruning of low-coverage whole-genome sequence variants in a pure Duroc population.","authors":"Di Zhu, Yiqiang Zhao, Ran Zhang, Hanyu Wu, Gengyuan Cai, Zhenfang Wu, Yuzhe Wang, Xiaoxiang Hu","doi":"10.1186/s12711-023-00843-w","DOIUrl":"10.1186/s12711-023-00843-w","url":null,"PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"55 1","pages":"72"},"PeriodicalIF":4.1,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10583454/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49685287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Agricultural and Forestry Sciences","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-16 DOI: 10.1186/s12711-023-00847-6
Ben J Hayes, James Copley, Elsie Dodd, Elizabeth M Ross, Shannon Speight, Geoffry Fordyce
Background: It has been challenging to implement genomic selection in multi-breed tropical beef cattle populations. If commercial (often crossbred) animals could be used in the reference population for these genomic evaluations, this could allow for very large reference populations. In tropical beef systems, such animals often have no pedigree information. Here we investigate potential models for such data, using marker heterozygosity (to model heterosis) and breed composition derived from genetic markers as covariates in the model. Models treated breed effects as either fixed or random, and included genomic best linear unbiased prediction (GBLUP) and BayesR. A tropically adapted beef cattle dataset of 29,391 purebred, crossbred and composite commercial animals was used to evaluate the models.
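The marker-heterozygosity covariate can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 allele counts with NaN for missing calls; the function name and the coding are assumptions for illustration and are not taken from the study. Breed-composition covariates would come from a separate estimation step that is not shown here.

```python
import numpy as np

def marker_heterozygosity(genotypes):
    """Per-animal proportion of called markers that are heterozygous.

    genotypes: (n_animals, n_snps) array coded 0/1/2, np.nan if missing.
    Returns an (n_animals,) array; NaN for animals with no called markers.
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)          # markers with a genotype call
    het = g == 1                   # NaN compares False, so missing is excluded
    n_called = called.sum(axis=1)
    # np.maximum guards the division for animals without any called marker.
    proportion = het.sum(axis=1) / np.maximum(n_called, 1)
    return np.where(n_called > 0, proportion, np.nan)
```

The resulting vector can be added as a column of the fixed-effects design matrix, so that its regression coefficient acts as an estimate of heterosis per unit of heterozygosity.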
Results: Treating breed effects as random, in an approach analogous to genetic groups, allowed partitioning of the genetic variance into within-breed and across-breed components (even with a large number of breeds), and estimation of within-breed and across-breed genomic estimated breeding values (GEBV). We demonstrate that moderately accurate (0.30-0.43) GEBV can be calculated using these models. Treating breed effects as random gave more accurate GEBV than treating breed as fixed. A simple GBLUP model where no breed effects were fitted gave the same accuracy (and correlations of GEBV very close to 1) as a model where GEBV for within-breed and the GEBV for (random) across-breed effects were included. When GEBV were predicted for herds with no data in the reference population, BayesR resulted in the highest accuracy, with 3% accuracy improvement averaged across traits, especially when the validation population was less related to the reference population. Estimates of heterosis from our models were in line with previous estimates from beef cattle. A method for estimating the number of effective breed comparisons for each breed combination accumulated across contemporary groups is presented.
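For readers unfamiliar with the simple GBLUP model referred to above, the following is a minimal sketch of a model with no breed effects. It assumes a VanRaden-type genomic relationship matrix built from observed allele frequencies, a known variance ratio `lam` (residual variance divided by genetic variance), and a phenotype mean estimated by the simple average rather than by generalized least squares. These simplifications and all function names are assumptions, not details given in the abstract.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix G = ZZ' / (2 * sum(p * (1 - p))).

    genotypes: (n_animals, n_snps) array of allele counts coded 0/1/2.
    Allele frequencies p are estimated from the data themselves.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0
    z = m - 2.0 * p                      # centre each marker column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(g_matrix, y_train, train_idx, target_idx, lam):
    """GEBV for target animals from phenotypes of training animals.

    Solves (G_tt + lam * I) alpha = y - mean(y) and returns G_vt @ alpha.
    """
    train_idx = np.asarray(train_idx)
    target_idx = np.asarray(target_idx)
    y = np.asarray(y_train, dtype=float)
    g_tt = g_matrix[np.ix_(train_idx, train_idx)]
    g_vt = g_matrix[np.ix_(target_idx, train_idx)]
    alpha = np.linalg.solve(g_tt + lam * np.eye(len(train_idx)), y - y.mean())
    return g_vt @ alpha
```

Accuracy in a validation design of the kind described would then be assessed by correlating the predicted GEBV of animals left out of the reference population with their (adjusted) phenotypes.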
Conclusions: When no pedigree is available, breed composition and heterosis for inclusion in multi-breed genomic evaluation can be estimated from genotypes. When GEBV were predicted for herds with no data in the reference population, BayesR resulted in the highest accuracy.
{"title":"Multi-breed genomic evaluation for tropical beef cattle when no pedigree information is available.","authors":"Ben J Hayes, James Copley, Elsie Dodd, Elizabeth M Ross, Shannon Speight, Geoffry Fordyce","doi":"10.1186/s12711-023-00847-6","DOIUrl":"10.1186/s12711-023-00847-6","url":null,"PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"55 1","pages":"71"},"PeriodicalIF":4.1,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10578004/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41241102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Agricultural and Forestry Sciences","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-10-12 DOI: 10.1186/s12711-023-00848-5
Marie-Pierre Sanchez, Thierry Tribout, Naveen K Kadri, Praveen K Chitneedi, Steffen Maak, Chris Hozé, Mekki Boussaha, Pascal Croiseau, Romain Philippe, Mirjam Spengeler, Christa Kühn, Yining Wang, Changxi Li, Graham Plastow, Hubert Pausch, Didier Boichard
Background: Combining the results of within-population genome-wide association studies (GWAS) based on whole-genome sequences into a single meta-analysis (MA) is an accurate and powerful method for identifying variants associated with complex traits. As part of the H2020 BovReg project, we performed sequence-level MA for beef production traits. Five partners from France, Switzerland, Germany, and Canada contributed summary statistics from sequence-based GWAS conducted with 54,782 animals from 15 purebred or crossbred populations. We combined the summary statistics for four growth, nine morphology, and 15 carcass traits into 16 MA, using both fixed-effects and z-score methods.
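The two ways of combining per-population summary statistics for a single variant can be sketched as follows. The abstract does not give the formulas, so the code uses the standard textbook forms: inverse-variance weighting for the fixed-effects method and a sample-size-weighted Stouffer combination for the z-score method. The choice of sample-size weights and all function names are assumptions.

```python
import math

def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effects estimate for one variant.

    Returns (combined effect, its standard error, z statistic).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se

def zscore_meta(z_scores, sample_sizes):
    """Sample-size-weighted combination of signed z-scores."""
    numerator = sum(math.sqrt(n) * z for n, z in zip(sample_sizes, z_scores))
    return numerator / math.sqrt(sum(sample_sizes))

def two_sided_p(z):
    """Two-sided P-value of a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

The fixed-effects form needs effect sizes on a comparable scale across populations, whereas the z-score form only needs the direction and strength of evidence, which is why the latter is often preferred when traits or units differ between studies.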
Results: The fixed-effects method was generally more informative for indicating potentially causal variants, although we combined substantially different traits in each MA. In comparison with within-population GWAS, this approach highlighted (i) a larger number of quantitative trait loci (QTL), (ii) QTL more frequently located in genomic regions known for their effects on growth and meat/carcass traits, (iii) a smaller number of genomic variants within the QTL, and (iv) candidate variants that were more frequently located in genes. MA pinpointed variants in genes, including MSTN, LCORL, and PLAG1, that have been previously associated with morphology and carcass traits. We also identified dozens of other variants located in genes associated with growth and carcass traits, or with a function that may be related to meat production (e.g., HS6ST1, HERC2, WDR75, COL3A1, SLIT2, MED28, and ANKAR). Some of these variants overlapped with expression or splicing QTL reported in the cattle Genotype-Tissue Expression atlas (CattleGTEx) and could therefore regulate gene expression.
Conclusions: By identifying candidate genes and potential causal variants associated with beef production traits in cattle, MA demonstrates great potential for investigating the biological mechanisms underlying these traits. As a complement to within-population GWAS, this approach can provide deeper insights into the genetic architecture of complex traits in beef cattle.
{"title":"Sequence-based GWAS meta-analyses for beef production traits.","authors":"Marie-Pierre Sanchez, Thierry Tribout, Naveen K Kadri, Praveen K Chitneedi, Steffen Maak, Chris Hozé, Mekki Boussaha, Pascal Croiseau, Romain Philippe, Mirjam Spengeler, Christa Kühn, Yining Wang, Changxi Li, Graham Plastow, Hubert Pausch, Didier Boichard","doi":"10.1186/s12711-023-00848-5","DOIUrl":"10.1186/s12711-023-00848-5","url":null,"PeriodicalId":55120,"journal":{"name":"Genetics Selection Evolution","volume":"55 1","pages":"70"},"PeriodicalIF":4.1,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10568825/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41220687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Agricultural and Forestry Sciences","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}