Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae167
John A Calarco, Seth R Taylor, David M Miller
Reliable methods for detecting and analyzing gene expression are necessary tools for understanding development and investigating biological responses to genetic and environmental perturbation. With its fully sequenced genome, invariant cell lineage, transparent body, wiring diagram, detailed anatomy, and wide array of genetic tools, Caenorhabditis elegans is an exceptionally useful model organism for linking gene expression to cellular phenotypes. The development of new techniques in recent years has greatly expanded our ability to detect gene expression at high resolution. Here, we provide an overview of gene expression methods for C. elegans, including techniques for detecting transcripts and proteins in situ, bulk RNA sequencing of whole worms and specific tissues and cells, single-cell RNA sequencing, and high-throughput proteomics. We discuss important considerations for choosing among these techniques and provide an overview of publicly available online resources for gene expression data.
Title: Detecting gene expression in Caenorhabditis elegans. Journal: Genetics.
Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae180
Amjad Dabi, Daniel R Schrider
Simulations are an essential tool in all areas of population genetic research, used in tasks such as the validation of theoretical analysis and the study of complex evolutionary models. Forward-in-time simulations are especially flexible, allowing for various types of natural selection, complex genetic architectures, and non-Wright-Fisher dynamics. However, their intense computational requirements can be prohibitive to simulating large populations and genomes. A popular method to alleviate this burden is to scale down the population size by some scaling factor while scaling up the mutation rate, selection coefficients, and recombination rate by the same factor. However, this rescaling approach may in some cases bias simulation results. To investigate the manner and degree to which rescaling impacts simulation outcomes, we carried out simulations with different demographic histories and distributions of fitness effects using several values of the rescaling factor, Q, and compared the deviation of key outcomes (fixation times, allele frequencies, linkage disequilibrium, and the fraction of mutations that fix during the simulation) between the scaled and unscaled simulations. Our results indicate that scaling introduces substantial biases to each of these measured outcomes, even at small values of Q. Moreover, the nature of these effects depends on the evolutionary model and scaling factor being examined. While increasing the scaling factor tends to increase the observed biases, this relationship is not always straightforward; thus, it may be difficult to know the impact of scaling on simulation outcomes a priori. However, it appears that for most models, only a small number of replicates was needed to accurately quantify the bias produced by rescaling for a given Q. 
In summary, while rescaling forward-in-time simulations may be necessary in many cases, researchers should be aware of the rescaling procedure's impact on simulation outcomes and consider investigating its magnitude in smaller scale simulations of the desired model(s) before selecting an appropriate value of Q.
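The rescaling procedure described above (divide the population size by Q, multiply the mutation rate, selection coefficients, and recombination rate by Q) can be sketched as follows. This is a minimal illustration, not the authors' code: the `SimParams` container, its field names, and the `rescale` helper are hypothetical, and a single selection coefficient stands in for a full distribution of fitness effects.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Hypothetical container for forward-simulation parameters."""
    pop_size: int               # number of diploid individuals, N
    mutation_rate: float        # per-site, per-generation
    recombination_rate: float   # per-site, per-generation
    selection_coeff: float      # one representative selection coefficient


def rescale(params: SimParams, q: float) -> SimParams:
    """Shrink N by q and inflate the per-generation rates by q.

    The population-scaled products N*mu, N*r, and N*s stay (approximately)
    constant, which is the rationale for the approach.
    """
    if q < 1:
        raise ValueError("rescaling factor Q must be >= 1")
    return SimParams(
        pop_size=max(1, round(params.pop_size / q)),
        mutation_rate=params.mutation_rate * q,
        recombination_rate=params.recombination_rate * q,
        selection_coeff=params.selection_coeff * q,
    )


base = SimParams(pop_size=10_000, mutation_rate=1e-8,
                 recombination_rate=1e-8, selection_coeff=0.001)
scaled = rescale(base, q=10)

# The scaled products match; the abstract's point is that outcomes can still differ.
print(scaled.pop_size)
print(math.isclose(base.pop_size * base.selection_coeff,
                   scaled.pop_size * scaled.selection_coeff))
```

Matching the population-scaled products does not guarantee matching outcomes, which is the bias the abstract reports; comparing a few replicates at Q = 1 against the chosen Q is the check the authors recommend.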
Title: Population size rescaling significantly biases outcomes of forward-in-time population genetic simulations. Journal: Genetics. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11708920/pdf/
Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae187
Markku Kuismin, Mikko J Sillanpää
Gene co-expression networks typically comprise modules and their associated hub genes, which regulate numerous downstream interactions within the network. Methods for hub screening, as well as data-driven estimation of hub co-expression networks using graphical models, can serve as useful tools for identifying these hubs. Graphical model-based penalization methods typically have one or multiple regularization terms, each of which encourages some favorable characteristic (e.g. sparsity, hubs, and power-law) in the estimated complex gene network. It is common practice to find a single optimal graphical model corresponding to a specific value of the regularization parameter(s). However, instead of doing this, one could aggregate information across several graphical models, all of which depend on the same data set, along the solution path in the hub gene detection process. We propose a novel method for detecting hub genes that utilizes the information available in the solution path. Our procedure is related to stability selection, but we replace resampling with a simple statistic. This procedure amalgamates information from each node of the data-driven graphical models into a single influence statistic, similar to Cook's distance. We call this statistic the Mean Degree Squared Distance (MDSD). Our simulation and empirical studies demonstrate that the MDSD statistic maintains a good balance between false positive and true positive hubs. An R package, MDSD, is publicly available on GitHub under the General Public License: https://github.com/markkukuismin/MDSD.
Title: Network hub gene detection using the entire solution path information. Journal: Genetics. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11708912/pdf/
Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae178
Yu Sung Kang, Jeffery Jung, Holly L Brown, Chase Mateusiak, Tamara L Doering, Michael R Brent
Cryptococcus neoformans is an opportunistic fungal pathogen with a polysaccharide capsule that becomes greatly enlarged in the mammalian host and during in vitro growth under host-like conditions. To understand how individual environmental signals affect capsule size and gene expression, we grew cells in all combinations of 5 signals implicated in capsule size and systematically measured cell and capsule sizes. We also sampled these cultures over time and performed RNA-seq in quadruplicate, yielding 881 RNA-seq samples. Analysis of the resulting data sets showed that capsule induction in tissue culture medium, typically used to represent host-like conditions, requires the presence of either CO2 or exogenous cyclic AMP. Surprisingly, adding either of these pushes overall gene expression in the opposite direction from tissue culture media alone, even though both are required for capsule development. Another unexpected finding was that rich medium blocks capsule growth completely. Statistical analysis further revealed many genes whose expression is associated with capsule thickness; deletion of one of these significantly reduced capsule size. Beyond illuminating capsule induction, our massive, uniformly collected data set will be a significant resource for the research community.
Title: Leveraging a new data resource to define the response of Cryptococcus neoformans to environmental signals. Journal: Genetics. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11708910/pdf/
Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae177
Title: Editor's Note: Ribosome Association and Stability of the Nascent Polypeptide-Associated Complex Is Dependent Upon Its Own Ubiquitination. Journal: Genetics.
Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae171
Haixiao Hu, Renaud Rincent, Daniel E Runcie
Multienvironment trials (METs) are crucial for identifying varieties that perform well across a target population of environments. However, METs are typically too small to sufficiently represent all relevant environment-types, and face challenges from changing environment-types due to climate change. Statistical methods that enable prediction of variety performance for new environments beyond the METs are needed. We recently developed MegaLMM, a statistical model that can leverage hundreds of trials to significantly improve genetic value prediction accuracy within METs. Here, we extend MegaLMM to enable genomic prediction in new environments by learning regressions of latent factor loadings on Environmental Covariates (ECs) across trials. We evaluated the extended MegaLMM using the maize Genome-To-Fields dataset, consisting of 4,402 varieties cultivated in 195 trials with 87.1% of phenotypic values missing, and demonstrated its high accuracy in genomic prediction under various breeding scenarios. Furthermore, we showcased MegaLMM's superiority over univariate GBLUP in predicting trait performance of experimental genotypes in new environments. Finally, we explored the use of higher-dimensional quantitative ECs and discussed when and how detailed environmental data can be leveraged for genomic prediction from METs. We propose that MegaLMM can be applied to plant breeding of diverse crops and different fields of genetics where large-scale linear mixed models are utilized.
Title: MegaLMM improves genomic predictions in new environments using environmental covariates. Journal: Genetics. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11708919/pdf/
Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae182
Nathan W Anderson, Lloyd Kirk, Joshua G Schraiber, Aaron P Ragsdale
Many phenotypic traits have a polygenic genetic basis, making it challenging to learn their genetic architectures and predict individual phenotypes. One promising avenue to resolve the genetic basis of complex traits is through evolve-and-resequence (E&R) experiments, in which laboratory populations are exposed to some selective pressure and trait-contributing loci are identified by extreme frequency changes over the course of the experiment. However, small laboratory populations will experience substantial random genetic drift, and it is difficult to determine whether selection played a role in a given allele frequency change (AFC). Predicting AFCs under drift and selection, even for alleles contributing to simple, monogenic traits, has remained a challenging problem. Recently, there have been efforts to apply the path integral, a method borrowed from physics, to solve this problem. So far, this approach has been limited to genic selection, and is therefore inadequate to capture the complexity of quantitative, highly polygenic traits that are commonly studied. Here, we extend one of these path integral methods, the perturbation approximation, to selection scenarios that are of interest to quantitative genetics. We derive analytic expressions for the transition probability (i.e. the probability that an allele will change in frequency from x to y in time t) of an allele contributing to a trait subject to stabilizing selection, as well as that of an allele contributing to a trait rapidly adapting to a new phenotypic optimum. We use these expressions to characterize the use of AFC to test for selection, as well as explore optimal design choices for E&R experiments to uncover the genetic architecture of polygenic traits under selection.
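For orientation, the transition probability defined above (the probability that an allele moves from frequency x to y in time t) can be estimated by brute-force simulation. The sketch below is not the path integral or perturbation approximation from the paper. It is a plain Monte Carlo Wright-Fisher baseline with genic selection only, and every function and parameter name in it is hypothetical. Because an exact target frequency is rarely hit, it estimates the probability of ending inside a frequency band [y_lo, y_hi].

```python
import random


def wf_step(count: int, n_copies: int, s: float, rng: random.Random) -> int:
    """One Wright-Fisher generation: genic selection, then binomial drift."""
    x = count / n_copies
    # Expected frequency after genic selection with coefficient s (s > -1).
    p = x * (1 + s) / (1 + s * x)
    # Binomial sampling of n_copies gene copies for the next generation.
    return sum(1 for _ in range(n_copies) if rng.random() < p)


def transition_probability(x0, y_lo, y_hi, generations, n_copies,
                           s=0.0, replicates=2000, seed=0):
    """Monte Carlo estimate of P(y_lo <= frequency <= y_hi at time t | x0)."""
    rng = random.Random(seed)
    start = round(x0 * n_copies)
    hits = 0
    for _ in range(replicates):
        count = start
        for _ in range(generations):
            count = wf_step(count, n_copies, s, rng)
            if count in (0, n_copies):  # loss or fixation is absorbing
                break
        if y_lo <= count / n_copies <= y_hi:
            hits += 1
    return hits / replicates


# Chance that an allele at 20% reaches at least 50% within 20 generations,
# under pure drift versus modest positive selection, in a small population.
drift = transition_probability(0.2, 0.5, 1.0, 20, 100, s=0.0, replicates=500)
selected = transition_probability(0.2, 0.5, 1.0, 20, 100, s=0.1, replicates=500)
print(drift, selected)
```

Comparing the two estimates illustrates why large allele frequency changes in small laboratory populations are hard to attribute to selection. Analytic expressions such as those the authors derive avoid the simulation cost and extend to selection models this sketch does not cover.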
Title: A path integral approach for allele frequency dynamics under polygenic selection. Journal: Genetics.
Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae183
Kevin A Bird, Jordan R Brock, Paul P Grabowski, Avril M Harder, Adam L Healy, Shengqiang Shu, Kerrie Barry, LoriBeth Boston, Christopher Daum, Jie Guo, Anna Lipzen, Rachel Walstead, Jane Grimwood, Jeremy Schmutz, Chaofu Lu, Luca Comai, John K McKay, J Chris Pires, Patrick P Edger, John T Lovell, Daniel J Kliebenstein
Ancient whole-genome duplications are believed to facilitate novelty and adaptation by providing the raw fuel for new genes. However, it is unclear how recent whole-genome duplications may contribute to evolvability within recent polyploids. Hybridization accompanying some whole-genome duplications may combine divergent gene content among diploid species. Some theory and evidence suggest that polyploids have a greater accumulation and tolerance of gene presence-absence and genomic structural variation, but it is unclear to what extent either is true. To test how recent polyploidy may influence pangenomic variation, we sequenced, assembled, and annotated 12 complete, chromosome-scale genomes of Camelina sativa, an allohexaploid biofuel crop with 3 distinct subgenomes. Using pangenomic comparative analyses, we characterized gene presence-absence and genomic structural variation both within and between the subgenomes. We found over 75% of ortholog gene clusters are core in C. sativa and <10% of sequence space was affected by genomic structural rearrangements. In contrast, 19% of gene clusters were unique to one subgenome, and the majority of these were Camelina specific (no ortholog in Arabidopsis). We identified an inversion that may contribute to vernalization requirements in winter-type Camelina and an enrichment of Camelina-specific genes with enzymatic processes related to seed oil quality and Camelina's unique glucosinolate profile. Genes related to these traits exhibited little presence-absence variation. Our results reveal minimal pangenomic variation in this species and instead show how hybridization accompanied by whole-genome duplication may benefit polyploids by merging diverged gene content of different species.
Title: Allopolyploidy expanded gene content but not pangenomic variation in the hexaploid oilseed Camelina sativa. Journal: Genetics.
Pub Date: 2025-01-08; DOI: 10.1093/genetics/iyae192
Yun Deng, Rasmus Nielsen, Yun S Song
It was recently reported that a severe ancient bottleneck occurred around 900 thousand years ago in the ancestry of African populations, while this signal is absent in non-African populations. Here, we present evidence to show that this finding is likely a statistical artifact.
Title: A previously reported bottleneck in human ancestry 900 kya is likely a statistical artifact. Journal: Genetics. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11708913/pdf/
Pub Date: 2025-01-08 DOI: 10.1093/genetics/iyae175
Matthew V Rockman
Self-fertile Caenorhabditis nematodes carry a surprising number of Medea elements, alleles that act in heterozygous mothers and cause death or developmental delay in offspring that do not inherit them. At some loci, both alleles in a cross operate as independent Medeas, affecting all the homozygous progeny of a selfing heterozygote. The genomic coincidence of Medea elements and ancient, deeply coalescing haplotypes, which pepper the otherwise homogeneous genomes of these animals, raises questions about how these apparent gene-drive elements persist for long periods of time. Here, I investigate how mating system affects the evolution of Medeas, and their paternal-effect counterparts, peels. Despite an intuition that antagonistic alleles should induce balancing selection by killing homozygotes, models show that, under partial selfing, antagonistic elements experience positive frequency dependence: the common allele drives the rare one extinct, even if the rare one is more penetrant. Analytical results for the threshold frequency required for one allele to invade a population show that a very weakly penetrant allele, one whose effects would escape laboratory detection, could nevertheless prevent a much more penetrant allele from invading under high rates of selfing. Ubiquitous weak antagonistic Medeas and peels could then act as localized barriers to gene flow between populations, generating genomic islands of deep coalescence. Analysis of gene expression data, however, suggests that this cannot be the whole story. A complementary explanation is that ordinary ecological balancing selection generates ancient haplotypes on which Medeas can evolve, while high homozygosity in these selfers minimizes the role of gene drive in their evolution.
{"title":"Parental-effect gene-drive elements under partial selfing, or why do Caenorhabditis genomes have hyperdivergent regions?","authors":"Matthew V Rockman","doi":"10.1093/genetics/iyae175","DOIUrl":"10.1093/genetics/iyae175","url":null,"abstract":"<p><p>Self-fertile Caenorhabditis nematodes carry a surprising number of Medea elements, alleles that act in heterozygous mothers and cause death or developmental delay in offspring that do not inherit them. At some loci, both alleles in a cross operate as independent Medeas, affecting all the homozygous progeny of a selfing heterozygote. The genomic coincidence of Medea elements and ancient, deeply coalescing haplotypes, which pepper the otherwise homogeneous genomes of these animals, raises questions about how these apparent gene-drive elements persist for long periods of time. Here, I investigate how mating system affects the evolution of Medeas, and their paternal-effect counterparts, peels. Despite an intuition that antagonistic alleles should induce balancing selection by killing homozygotes, models show that, under partial selfing, antagonistic elements experience positive frequency dependence: the common allele drives the rare one extinct, even if the rare one is more penetrant. Analytical results for the threshold frequency required for one allele to invade a population show that a very weakly penetrant allele, one whose effects would escape laboratory detection, could nevertheless prevent a much more penetrant allele from invading under high rates of selfing. Ubiquitous weak antagonistic Medeas and peels could then act as localized barriers to gene flow between populations, generating genomic islands of deep coalescence. Analysis of gene expression data, however, suggests that this cannot be the whole story. A complementary explanation is that ordinary ecological balancing selection generates ancient haplotypes on which Medeas can evolve, while high homozygosity in these selfers minimizes the role of gene drive in their evolution.</p>","PeriodicalId":48925,"journal":{"name":"Genetics","volume":" ","pages":"1-36"},"PeriodicalIF":3.3,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11708918/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142548475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}