The paradigm of genomic selection: Does it need an update?
Johannes A. Lenstra
Animal Research and One Health, vol. 2, no. 4, pp. 360-362. Published 2024-10-04.
DOI: 10.1002/aro2.88
https://onlinelibrary.wiley.com/doi/10.1002/aro2.88
Citation count: 0
Abstract
The genetics and genomics of livestock is, as for other species, a dynamic and successful field of research. It is divided into two clearly different, although closely interacting, disciplines: molecular and quantitative genetics. Remarkably, this contrast has a close parallel in the opposing views during a short and fierce war (1904–1906) between Mendelians and biometricians. Although the accepted views soon became more balanced [1, 2], the 20th century saw the emergence of two distinct genetic disciplines.
The development of molecular genetics is an amazing and unending series of pioneering success stories featuring a legion of Nobel prize winners [3]: from chromosomes to DNA and to the central dogma; from recombinant DNA to PCR, microsatellites and SNPs; routine whole-genome sequencing (WGS), with telomere-to-telomere genomes and pangenomes as the newest toys; and now also CRISPR/Cas9 gene editing, although this is not yet of primary relevance for livestock [4, 5]. This was all typical laboratory science, which has now become a lot cleaner through automation and a growing emphasis on bioinformatics.
It is typical of this hectic progress that the promises made after one breakthrough were fulfilled only after the next. Southern blotting of restriction fragment length polymorphism (RFLP) markers in the 1980s and, a little later, PCR–RFLP did not deliver the intended dense genetic map of a genome, so the discovery of microsatellites at the end of the decade was most timely. This allowed the genetic mapping of monogenic traits, but until 20 years ago most causative mutations in livestock species were found via the candidate gene approach [1, 6]. In the new millennium, microsatellites were replaced by high-density genome-wide SNP arrays, which deliver accurate genetic localizations. At the same time, WGS became affordable and monogenic causative variants became sitting ducks. However, we did not unravel the molecular mechanisms of complex traits [6, 7], so now we accept a less than satisfactory infinitesimal model of countless small contributions [4].
During the decade of WWII, the quantitative geneticists, who never touch a pipette, began to provide scientific support to the breeding industry and developed the concept of breeding values [8]. For a long time, these were based solely on phenotypes, but the quantitative geneticists did not hesitate to exploit the advances in the molecular field. During the last two decades of the millennium, the concept or dream of marker-assisted selection was an important source of inspiration [9, 10]. This led to genetic localizations of enough quantitative trait loci (QTL) to fill the Animal QTLdb, but these explain only a small part of the phenotypic variation [4].
Again, we needed another breakthrough to fulfill the promises already made. In a visionary paper, Meuwissen et al. proposed genomic selection (GS) based on the predicted contributions to the breeding value of variants across the whole genome [11]. GS became a resounding success [7], a triumph for quantitative genetics, which now ensures continuous genetic progress for the highly productive breeds all around the world. Breeders are happy, so why should we still care about the underlying molecular mechanisms?
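The idea behind GS, summing predicted per-variant contributions into one breeding value, can be illustrated with a minimal sketch. The function name, the 0/1/2 allele-count coding and the effect values are assumptions made here for illustration; they are not taken from Meuwissen et al., and in practice the marker effects would first have to be estimated from a reference population.

```python
from typing import Sequence


def genomic_breeding_value(allele_counts: Sequence[int],
                           marker_effects: Sequence[float]) -> float:
    """Sum the predicted contributions of all markers for one animal.

    allele_counts  -- hypothetical coding: copies (0, 1 or 2) of the
                      counted allele at each marker
    marker_effects -- hypothetical, previously estimated effect per
                      allele copy at each marker
    """
    if len(allele_counts) != len(marker_effects):
        raise ValueError("one effect is needed per genotyped marker")
    # Additive model: each marker contributes count * effect.
    return sum(g * a for g, a in zip(allele_counts, marker_effects))
```

With three markers, `genomic_breeding_value([2, 1, 0], [0.5, -0.25, 0.125])` adds 1.0 and -0.25; a real application would sum over tens of thousands of markers, each with a very small effect.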
Of course, we care [7]. The molecular geneticists did not sit on their hands. WGS data reveal a multitude of missense and nonsense mutations, and we can predict their functional consequences. If a deleterious mutation results in a loss of function of an indispensable protein, there are no homozygotes for this mutation in the population. This autozygous depletion by embryonic lethality is also observed at the haplotype level [4]. Less drastic effects of autozygous genotypes (or of compound heterozygotes, if the paternal and maternal gene copies carry different recessive deleterious mutations) are sterility, a genetic disorder, reduced fitness and/or low productivity. Deleterious mutations may also be dominant in haploinsufficient genes. All this also holds for the regulatory mutations that control gene expression. These are more difficult to identify in WGS datasets, but may be detected via their deleterious effects.
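The reasoning behind homozygote depletion can be made concrete with a small calculation. The sketch below assumes random mating and Hardy-Weinberg proportions, and treats animals as independent; these are simplifications introduced here, and the function names are hypothetical. If a haplotype has frequency q among N genotyped animals, about N·q² homozygotes are expected, so observing none becomes very unlikely for a common haplotype unless homozygotes are not viable.

```python
def expected_homozygotes(n_animals: int, haplotype_freq: float) -> float:
    """Expected number of homozygous animals under Hardy-Weinberg proportions."""
    return n_animals * haplotype_freq ** 2


def prob_no_homozygotes(n_animals: int, haplotype_freq: float) -> float:
    """Probability of seeing zero homozygotes by chance alone.

    Assumes each animal is independently homozygous with probability q**2,
    which ignores pedigree structure (a deliberate simplification).
    """
    return (1.0 - haplotype_freq ** 2) ** n_animals
```

For example, with 10,000 genotyped animals and a haplotype frequency of 0.05, about 25 homozygotes would be expected, and the chance of finding none at all is vanishingly small.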
Fitness and performance are polygenic traits, but their causative variants may be linked to “intermediate phenotypes” or “endophenotypes”, for instance gene expression levels, enzyme activities or metabolite concentrations [4, 12].
A more recent and important development is the discovery, by novel long-read sequencing, of large structural variations (SVs): deletions, copy number variations or divergent (and therefore non-recombining) alleles involving up to millions of base pairs. These were so far largely overlooked by short-read WGS, but they change the gene repertoire, disrupt topologically associating domains and are associated with genetic diseases and several other traits [13-16]. Because of these observations, SVs are now considered a major source of phenotypic variation.
There are several examples of balancing selection, often by a favorable effect of heterozygous genotypes on production and a deleterious effect of homozygous mutant genotypes on fitness [4]. However, it is plausible that for most gene variants the effects on fitness and on agricultural performance correlate. Given the multitude of variants that potentially affect fitness, linkage disequilibrium with these variants is likely to cover a large part of the genome. Plausibly, this explains the perception of the infinitesimal model that underlies GS.
It may make sense to improve GS by accommodating the effect of causative variants [4] as far as their contributions to breeding values can be verified. This of course also applies to the large SVs. We do not know yet how many causative variants, many of which have low minor allele frequencies, would have the same predictive power as the current GS.
Another open question: would a panel of 10,000 or 20,000 of the most consequential causative variants already be useful? Why these numbers? These are presently the numbers of variants that can be genotyped in low-density bead arrays for $50 or less, depending on the number of samples. For the highly productive cattle breeds, this is not too much to invest in a single cow. Thus, we would see another paradigm shift, already mentioned by Georges et al. [4]: the routine testing of both sires and dams, allowing a two-dimensional GS (2D-GS) of breeding mates. Instead of “one sire fits all”, a DNA-mediated female choice would accomplish an individual heterosis, which is expected to improve the performance, health and well-being of the offspring as well as the genetic diversity of the population.
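One ingredient of such a mate choice can be sketched as follows, under assumptions made only for illustration: the tested variants are fully recessive, unlinked and transmitted in a Mendelian fashion, and each genotype is coded as the number of copies (0, 1 or 2) of the deleterious allele. The function names and data layout are hypothetical and do not describe the scheme of Georges et al.; a practical 2D-GS would also weigh breeding values and other criteria.

```python
from typing import Dict, List, Sequence


def affected_offspring_risk(sire: Sequence[int], dam: Sequence[int]) -> float:
    """Probability that a calf is homozygous at one or more tested loci.

    Each parent transmits the deleterious allele with probability count / 2;
    loci are treated as independent (an assumption of this sketch).
    """
    if len(sire) != len(dam):
        raise ValueError("sire and dam must be genotyped at the same loci")
    p_unaffected = 1.0
    for s, d in zip(sire, dam):
        p_unaffected *= 1.0 - (s / 2) * (d / 2)
    return 1.0 - p_unaffected


def rank_sires(dam: Sequence[int],
               sires: Dict[str, Sequence[int]]) -> List[str]:
    """Order candidate sires for one dam from lowest to highest risk."""
    return sorted(sires, key=lambda name: affected_offspring_risk(sires[name], dam))
```

For a dam carrying two of three tested variants, a sire that shares neither of them ranks first, a sire sharing one ranks second, and a sire sharing both ranks last, so the preferred sire differs from dam to dam.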
The feasibility of 2D-GS obviously depends on the number of variants that can be tested, the additional income per tested female and practical considerations. It will be more difficult for other species and minor breeds, so new and cheaper technologies are welcome. Only variants with a broad breed distribution would be useful across breeds. However, the many different breeds with separate histories and selection regimes ensure, together with de novo variants, an endless supply of causative variants that are worth investigating. For instance, they may implicate genes about whose function we do not yet have a clue.
Will this work? Or is it just another dream? At least, we would collect a lot of data on the functional effects of DNA variants. This is of fundamental interest and remains the core business of molecular genetics.
Johannes A. Lenstra: Conceptualization; investigation; methodology; writing - original draft; writing - review and editing.
The author declares no conflicts of interest.