Pub Date: 2024-11-15; DOI: 10.1093/genetics/iyae183
Kevin A Bird, Jordan R Brock, Paul P Grabowski, Avril M Harder, Adam Healy, Shengqiang Shu, Kerrie Barry, LoriBeth Boston, Christopher Daum, Jie Guo, Anna Lipzen, Rachel Walstead, Jane Grimwood, Jeremy Schmutz, Chaofu Lu, Luca Comai, John K McKay, J Chris Pires, Patrick P Edger, John T Lovell, Daniel J Kliebenstein
Ancient whole-genome duplications (WGDs) are believed to facilitate novelty and adaptation by providing the raw fuel for new genes. However, it is unclear how recent WGDs may contribute to evolvability within recent polyploids. Hybridization accompanying some WGDs may combine divergent gene content among diploid species. Some theory and evidence suggest that polyploids have a greater accumulation and tolerance of gene presence-absence and genomic structural variation, but it is unclear to what extent either is true. To test how recent polyploidy may influence pangenomic variation, we sequenced, assembled, and annotated twelve complete, chromosome-scale genomes of Camelina sativa, an allohexaploid biofuel crop with three distinct subgenomes. Using pangenomic comparative analyses, we characterized gene presence-absence and genomic structural variation both within and between the subgenomes. We found over 75% of ortholog gene clusters are core in Camelina sativa and <10% of sequence space was affected by genomic structural rearrangements. In contrast, 19% of gene clusters were unique to one subgenome, and the majority of these were Camelina-specific (no ortholog in Arabidopsis). We identified an inversion that may contribute to vernalization requirements in winter-type Camelina, and an enrichment of Camelina-specific genes with enzymatic processes related to seed oil quality and Camelina's unique glucosinolate profile. Genes related to these traits exhibited little presence-absence variation. Our results reveal minimal pangenomic variation in this species, and instead show how hybridization accompanied by WGD may benefit polyploids by merging diverged gene content of different species.
Title: Allopolyploidy expanded gene content but not pangenomic variation in the hexaploid oilseed Camelina sativa. Journal: Genetics.
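The core versus subgenome-specific split reported in the abstract above comes down to counting, for each ortholog cluster, how many genomes carry it. The following is a minimal sketch of that bookkeeping, not the authors' pipeline: the function names, the three-way `core`/`shell`/`private` labels, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def classify_clusters(presence, n_genomes):
    """Label each ortholog cluster by how many genomes carry it.

    `presence` maps a cluster ID to the set of genome names containing
    at least one member gene. Labels: 'core' (all genomes), 'private'
    (exactly one genome), 'shell' (anything in between).
    """
    labels = {}
    for cluster, genomes in presence.items():
        k = len(genomes)
        if k == n_genomes:
            labels[cluster] = "core"
        elif k == 1:
            labels[cluster] = "private"
        else:
            labels[cluster] = "shell"
    return labels


def core_fraction(labels):
    """Fraction of clusters labelled 'core'."""
    counts = Counter(labels.values())
    return counts["core"] / len(labels)


# Hypothetical toy data: four clusters across three genomes.
toy = {
    "OG1": {"g1", "g2", "g3"},
    "OG2": {"g1", "g2", "g3"},
    "OG3": {"g1", "g2"},
    "OG4": {"g3"},
}
toy_labels = classify_clusters(toy, n_genomes=3)
print(toy_labels, core_fraction(toy_labels))
```

The same function could be applied with subgenomes in place of genomes to count clusters unique to one subgenome.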
Pub Date: 2024-11-13; DOI: 10.1093/genetics/iyae187
Markku Kuismin, Mikko J Sillanpää
Gene co-expression networks typically comprise modules and their associated hub genes, which regulate numerous downstream interactions within the network. Methods for hub screening, as well as data-driven estimation of hub co-expression networks using graphical models, can serve as useful tools for identifying these hubs. Graphical model-based penalization methods typically have one or multiple regularization terms, each of which encourages some favorable characteristic (e.g., sparsity, hubs, power-law) in the estimated complex gene network. It is common practice to find a single optimal graphical model corresponding to a specific value of the regularization parameter(s). However, instead of doing this, one could aggregate information across several graphical models, all of which depend on the same data set, along the solution path in the hub gene detection process. We propose a novel method for detecting hub genes that utilizes the information available in the solution path. Our procedure is related to stability selection, but we replace resampling with a simple statistic. This procedure amalgamates information from each node of the data-driven graphical models into a single influence statistic, similar to Cook's distance. We call this statistic the Mean Degree Squared Distance (MDSD). Our simulation and empirical studies demonstrate that the MDSD statistic maintains a good balance between false positive and true positive hubs. An R package, MDSD, is publicly available on GitHub under the General Public License at https://github.com/markkukuismin/MDSD.
Title: Network hub gene detection using the entire solution path information. Journal: Genetics.
Pub Date: 2024-11-12; DOI: 10.1093/genetics/iyae185
Stacia R Engel, Suzi Aleksander, Robert S Nash, Edith D Wong, Shuai Weng, Stuart R Miyasato, Gavin Sherlock, J Michael Cherry
Budding yeast (Saccharomyces cerevisiae) is the most extensively characterized eukaryotic model organism and has long been used to gain insight into the fundamentals of genetics, cellular biology, and the functions of specific genes and proteins. The Saccharomyces Genome Database (SGD) is a scientific resource that provides information about the genome and biology of S. cerevisiae. For more than 30 years, SGD has maintained the genetic nomenclature, chromosome maps, and functional annotation for budding yeast along with search and analysis tools to explore these data. Here we describe recent updates at SGD, including the two most recent reference genome annotation updates, expanded biochemical pathways representation, changes to SGD search and data files, and other enhancements to the SGD website and user interface. These activities are part of our continuing effort to promote insights gained from yeast to enable the discovery of functional relationships between sequence and gene products in fungi and higher eukaryotes.
Title: Saccharomyces Genome Database: Advances in Genome Annotation, Expanded Biochemical Pathways, and Other Key Enhancements. Journal: Genetics.
Pub Date: 2024-11-12; DOI: 10.1093/genetics/iyae182
Nathan W Anderson, Lloyd Kirk, Joshua G Schraiber, Aaron P Ragsdale
Many phenotypic traits have a polygenic genetic basis, making it challenging to learn their genetic architectures and predict individual phenotypes. One promising avenue to resolve the genetic basis of complex traits is through evolve-and-resequence experiments, in which laboratory populations are exposed to some selective pressure and trait-contributing loci are identified by extreme frequency changes over the course of the experiment. However, small laboratory populations will experience substantial random genetic drift, and it is difficult to determine whether selection played a role in a given allele frequency change. Predicting allele frequency changes under drift and selection, even for alleles contributing to simple, monogenic traits, has remained a challenging problem. Recently, there have been efforts to apply the path integral, a method borrowed from physics, to solve this problem. So far, this approach has been limited to genic selection, and is therefore inadequate to capture the complexity of quantitative, highly polygenic traits that are commonly studied. Here we extend one of these path integral methods, the perturbation approximation, to selection scenarios that are of interest to quantitative genetics. We derive analytic expressions for the transition probability (i.e., the probability that an allele will change in frequency from x to y in time t) of an allele contributing to a trait subject to stabilizing selection, as well as that of an allele contributing to a trait rapidly adapting to a new phenotypic optimum. We use these expressions to characterize the use of allele frequency change to test for selection, as well as explore optimal design choices for evolve-and-resequence experiments to uncover the genetic architecture of polygenic traits under selection.
Title: A path integral approach for allele frequency dynamics under polygenic selection. Journal: Genetics.
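To make the notion of a transition probability concrete (the probability that an allele moves from frequency x to y in time t), here is a minimal sketch that computes it exactly for a small population by iterating a Wright-Fisher Markov chain. This is not the path integral perturbation approximation described in the abstract: it is a brute-force baseline under the simpler genic selection model, and the function names and parameters are assumptions chosen for illustration.

```python
from math import comb


def wf_transition_matrix(n_copies, s=0.0):
    """One-generation Wright-Fisher transition matrix.

    `n_copies` is the number of gene copies (2N for diploids); state i
    means i copies of the focal allele. Genic selection with
    coefficient s (s > -1) shifts the sampling frequency before
    binomial drift is applied.
    """
    matrix = []
    for i in range(n_copies + 1):
        x = i / n_copies
        # Expected frequency after selection, before drift.
        x_sel = x * (1 + s) / (1 + s * x)
        row = [
            comb(n_copies, j) * x_sel**j * (1 - x_sel) ** (n_copies - j)
            for j in range(n_copies + 1)
        ]
        matrix.append(row)
    return matrix


def transition_probs(n_copies, start, generations, s=0.0):
    """Distribution over allele counts after `generations`, given `start` copies."""
    matrix = wf_transition_matrix(n_copies, s)
    dist = [0.0] * (n_copies + 1)
    dist[start] = 1.0
    for _ in range(generations):
        dist = [
            sum(dist[i] * matrix[i][j] for i in range(n_copies + 1))
            for j in range(n_copies + 1)
        ]
    return dist


def mean_frequency(dist):
    """Expected allele frequency under a distribution over counts."""
    n_copies = len(dist) - 1
    return sum(j * p for j, p in enumerate(dist)) / n_copies


# Hypothetical example: 20 gene copies, 5 focal copies, 10 generations.
neutral = transition_probs(20, start=5, generations=10)
selected = transition_probs(20, start=5, generations=10, s=0.1)
print(mean_frequency(neutral), mean_frequency(selected))
```

Comparing the neutral and selected distributions for the same start and end frequencies illustrates why drift alone can produce large changes in small laboratory populations. The cost of this exact approach grows quickly with population size, which is one motivation for analytic approximations.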
Pub Date: 2024-11-07; DOI: 10.1093/genetics/iyae181
Carlos S Djoko Tagne, Mersimine F M Kouamo, Magellan Tchouakui, Abdullahi Muhammad, Leon J L Mugenzi, Nelly M T Tatchou-Nebangwa, Riccado F Thiomela, Mahamat Gadji, Murielle J Wondji, Jack Hearn, Mbouobda H Desire, Sulaiman S Ibrahim, Charles S Wondji
Metabolic mechanisms conferring pyrethroid resistance in malaria vectors are jeopardizing the effectiveness of insecticide-based interventions, and identification of their markers is a key requirement for robust resistance management. Here, using a field-lab-field approach, we demonstrated that a single mutation, G454A, in the P450 CYP9K1 is driving pyrethroid resistance in the major malaria vector Anopheles funestus in East and Central Africa. A drastic reduction in CYP9K1 diversity was observed in Ugandan samples collected in 2014, with selection of a predominant haplotype (G454A mutation at 90%), which was completely absent in the other African regions. However, six years later (2020), the Ugandan 454A-CYP9K1 haplotype was found to be predominant in Cameroon (84.6%), but absent in Malawi (Southern Africa) and Ghana (West Africa). Comparative in vitro heterologous expression and metabolism assays revealed that the mutant 454A-CYP9K1 (R) allele metabolises significantly more of the type II pyrethroid deltamethrin than the wild-type G454-CYP9K1 (S) allele. Transgenic Drosophila melanogaster flies expressing the 454A-CYP9K1 (R) allele exhibited significantly higher resistance to type I and II pyrethroids than flies expressing the wild-type G454-CYP9K1 (S) allele. Furthermore, laboratory testing and field experimental hut trials in Cameroon demonstrated that mosquitoes harbouring the resistant 454A-CYP9K1 allele were significantly more likely to survive pyrethroid exposure (odds ratio = 567, p < 0.0001). This study highlights the rapid spread of the pyrethroid-resistant CYP9K1 allele, under directional selection in East and Central Africa, contributing to reduced bed net efficacy. The DNA-based assay designed here adds to the toolbox for resistance monitoring and will help improve resistance management strategies.
Title: A single mutation G454A in the P450 CYP9K1 drives pyrethroid resistance in the major malaria vector Anopheles funestus reducing bed net efficacy. Journal: Genetics.
Pub Date: 2024-11-06; DOI: 10.1093/genetics/iyae144
Gözde Atağ, Shamam Waldman, Shai Carmi, Mehmet Somel
Patterson's f-statistics are among the most heavily utilized tools for analyzing genome-wide allele frequency data for demographic inference. Beyond studying admixture, f3- and f4-statistics are also used for clustering populations to identify groups with similar histories. However, previous studies have noted an unexpected behavior of f-statistics: multiple populations from a certain region systematically show higher genetic affinity to a more distant population than to their neighbors, a pattern that is mismatched with alternative measures of genetic similarity. We call this counter-intuitive pattern "sister repulsion". We first present a novel instance of sister repulsion, where genomes from Bronze Age East Anatolian sites show higher affinity toward Bronze Age Greece rather than each other. This is observed both using f3- and f4-statistics, contrasts with archaeological/historical expectation, and also contradicts genetic affinity patterns captured using principal components analysis or multidimensional scaling on genetic distances. We then propose a simple demographic model to explain this pattern, where sister populations receive gene flow from a genetically distant source. We calculate f3- and f4-statistics using simulated genetic data with varying population genetic parameters, confirming that low-level gene flow from an external source into populations from 1 region can create sister repulsion in f-statistics. Unidirectional gene flow between the studied regions (without an external source) can likewise create repulsion. Meanwhile, similar to our empirical observations, multidimensional scaling analyses of genetic distances still cluster sister populations together. Overall, our results highlight the impact of low-level admixture events when inferring demographic history using f-statistics.
Title: An explanation for the sister repulsion phenomenon in Patterson's f-statistics. Journal: Genetics. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11538414/pdf/
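For readers unfamiliar with the statistics discussed in the abstract above, the sketch below computes f2, f3, and f4 from per-site population allele frequencies using their standard definitions as averages of frequency-difference products. It is a plug-in illustration only: it omits the finite-sample bias corrections and block-jackknife standard errors used in practice, and the function names and toy frequencies are assumptions, not data from the study.

```python
def f2(a, b):
    """f2(A, B): mean squared allele frequency difference."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def f3(c, a, b):
    """f3(C; A, B): mean of (c - a)(c - b) across sites."""
    return sum((ci - ai) * (ci - bi) for ci, ai, bi in zip(c, a, b)) / len(c)


def f4(a, b, c, d):
    """f4(A, B; C, D): mean of (a - b)(c - d) across sites."""
    return sum(
        (ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)
    ) / len(a)


# Hypothetical allele frequencies at three sites in four populations.
pop_a = [0.1, 0.5, 0.9]
pop_b = [0.2, 0.4, 0.8]
pop_c = [0.3, 0.3, 0.7]
outgroup = [0.0, 1.0, 0.5]

print(f3(outgroup, pop_a, pop_b))          # outgroup f3: drift shared by A and B
print(f4(outgroup, pop_c, pop_a, pop_b))   # sign indicates whether C is closer to A or B
```

In the outgroup form f3(O; A, B), larger values indicate more shared drift between A and B, which is the sense of "genetic affinity" in which sister populations can appear to repel each other.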
Pub Date: 2024-11-06; DOI: 10.1093/genetics/iyae140
Arthur Zwaenepoel, Himani Sachdeva, Christelle Fraïsse
We consider how the genetic architecture underlying locally adaptive traits determines the strength of a barrier to gene flow in a mainland-island model. Assuming a general life cycle, we derive an expression for the effective migration rate when local adaptation is due to genetic variation at many loci under directional selection on the island, allowing for arbitrary fitness and dominance effects across loci. We show how the effective migration rate can be combined with classical single-locus diffusion theory to accurately predict multilocus differentiation between the mainland and island at migration-selection-drift equilibrium and determine the migration rate beyond which local adaptation collapses, while accounting for genetic drift and weak linkage. Using our efficient numerical tools, we then present a detailed study of the effects of dominance on barriers to gene flow, showing that when total selection is sufficiently strong, more recessive local adaptation generates stronger barriers to gene flow. We then study how heterogeneous genetic architectures of local adaptation affect barriers to gene flow, characterizing adaptive differentiation at migration-selection balance for different distributions of fitness effects. We find that a more heterogeneous genetic architecture generally yields a stronger genome-wide barrier to gene flow and that the detailed genetic architecture underlying locally adaptive traits can have an important effect on observable differentiation when divergence is not too large. Lastly, we study the limits of our approach as loci become more tightly linked, showing that our predictions remain accurate over a large biologically relevant domain.
Title: The genetic architecture of polygenic local adaptation and its role in shaping barriers to gene flow. Authors: Arthur Zwaenepoel, Himani Sachdeva, Christelle Fraïsse. Journal: Genetics. Published: 2024-11-06. DOI: 10.1093/genetics/iyae140. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11538419/pdf/
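The abstract above refers to migration-selection balance in a mainland-island model and to a critical migration rate beyond which local adaptation collapses. As a rough illustration only, the sketch below uses the classical deterministic single-locus haploid model in continuous time, dp/dt = s·p·(1 − p) − m·p, where p is the island frequency of the locally adaptive allele, s its selective advantage, and m the rate of migration from a mainland fixed for the alternative allele. This is a textbook simplification and not the authors' multilocus method: it ignores drift, dominance, and linkage, and the function names and parameter values are hypothetical. The paper's effective migration rate would take the place of m in a calculation of this kind, but its expression is not given in the abstract and is not reproduced here.

```python
def island_equilibrium(m: float, s: float) -> float:
    """Equilibrium frequency of the locally adaptive allele on the island.

    Textbook haploid model (an assumption of this sketch, not the paper's):
        dp/dt = s*p*(1 - p) - m*p
    The polymorphic equilibrium is p* = 1 - m/s when m < s; once m >= s the
    local allele is swamped and p* = 0.
    """
    if s <= 0:
        raise ValueError("s must be positive")
    return max(0.0, 1.0 - m / s)


def simulate(m: float, s: float, p0: float = 0.99,
             dt: float = 0.1, steps: int = 20_000) -> float:
    """Forward-Euler integration of the same equation, as a numerical check."""
    p = p0
    for _ in range(steps):
        p += dt * (s * p * (1.0 - p) - m * p)
    return p


if __name__ == "__main__":
    s = 0.05  # hypothetical selection coefficient
    for m in (0.01, 0.04, 0.06):  # hypothetical migration rates
        print(f"m={m}: analytic p*={island_equilibrium(m, s):.4f}, "
              f"simulated p={simulate(m, s):.4f}")
```

In this simplified model the threshold sits at m = s. The abstract's point is that with many selected loci, dominance, and drift, the barrier strength and the collapse threshold depend on the full genetic architecture, which this single-locus sketch does not capture.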
Pub Date : 2024-11-06 DOI: 10.1093/genetics/iyae152
Robert A Townley, Kennedy S Stacy, Fatemeh Cheraghi, Claire C de la Cova
Raf protein kinases act as Ras-GTP sensing components of the ERK signal transduction pathway in animal cells, influencing cell proliferation, differentiation, and survival. In humans, somatic and germline mutations in the genes BRAF and RAF1 are associated with malignancies and developmental disorders. Recent studies shed light on the structure of activated Raf, a heterotetramer consisting of Raf and 14-3-3 dimers, and raised the possibility that a Raf C-terminal distal tail segment (DTS) regulates activation. We investigated the role of the DTS using the Caenorhabditis elegans Raf ortholog lin-45. Truncations removing the DTS strongly enhanced lin-45(S312A), a weak gain-of-function allele equivalent to RAF1 mutations found in patients with Noonan Syndrome. We genetically defined three elements of the LIN-45 DTS, which we termed the active site binding sequence (ASBS), the KTP motif, and the aromatic cluster. In the context of lin-45(S312A), the mutation of each of these elements enhanced activity. We used AlphaFold to predict DTS protein interactions for LIN-45, fly Raf, and human BRAF within the activated heterotetramer complex. We propose the following distinct functions for the LIN-45 DTS elements: (1) the ASBS binds the kinase active site as an inhibitor; (2) phosphorylation of the KTP motif modulates the DTS-kinase domain interaction; and (3) the aromatic cluster anchors the DTS in an inhibitory conformation. Human RASopathy-associated variants in BRAF affect residues of the DTS, consistent with these predictions. This work establishes that the Raf/LIN-45 DTS negatively regulates signaling in C. elegans and provides a model for its function in other Raf proteins.
Title: The Raf/LIN-45 C-terminal distal tail segment negatively regulates signaling in Caenorhabditis elegans. Journal: Genetics. Published: 2024-11-06. DOI: 10.1093/genetics/iyae152. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11538406/pdf/
Pub Date : 2024-11-06 DOI: 10.1093/genetics/iyae131
Carolaing Gabaldón, Ozgur Karakuzu, Danielle A Garsin
During challenge of Caenorhabditis elegans with human bacterial pathogens such as Pseudomonas aeruginosa and Enterococcus faecalis, the elicited host response can be damaging if not properly controlled. The activation of Nrf (nuclear factor erythroid-related factor)/CNC (Cap-n-collar) transcriptional regulators modulates the response by upregulating genes that neutralize damaging molecules and promote repair processes. Activation of the C. elegans Nrf ortholog, SKN-1, is tightly controlled by a myriad of regulatory mechanisms, but a central feature is an activating phosphorylation accomplished by the p38 mitogen-activated kinase (MAPK) cascade. In this work, loss of CDC-48, an AAA+ ATPase, was observed to severely compromise SKN-1 activation on pathogen, and we sought to understand the mechanism. CDC-48 is part of the endoplasmic reticulum (ER)-associated degradation (ERAD) complex where it functions as a remodeling chaperone enabling the translocation of proteins from the ER to the cytoplasm for degradation by the proteasome. Interestingly, one of the proteins retrotranslocated by ERAD, a process necessary for its activation, is SKN-1A, the ER isoform of SKN-1. However, we discovered that SKN-1A is not activated by pathogen exposure in marked contrast to the cytoplasmic-associated isoform SKN-1C. Rather, loss of CDC-48 blocks the antioxidant response normally orchestrated by SKN-1C by strongly inducing the unfolded protein response (UPR^ER). The data are consistent with the model of these 2 pathways being mutually inhibitory and support the emerging paradigm in the field of coordinated cooperation between different stress responses.
Title: SKN-1 activation during infection of Caenorhabditis elegans requires CDC-48 and endoplasmic reticulum proteostasis. Journal: Genetics. Published: 2024-11-06. DOI: 10.1093/genetics/iyae131. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11538416/pdf/
Pub Date : 2024-11-06 DOI: 10.1093/genetics/iyae156
Gabriel E Boyle, Katherine A Sitko, Jared G Galloway, Hugh K Haddox, Aisha Haley Bianchi, Ajeya Dixon, Melinda K Wheelock, Allyssa J Vandi, Ziyu R Wang, Raine E S Thomson, Riddhiman K Garge, Allan E Rettie, Alan F Rubin, Renee C Geck, Elizabeth M J Gillam, William S DeWitt, Frederick A Matsen, Douglas M Fowler
The cytochrome P450 enzyme family metabolizes ∼80% of small molecule drugs. Variants in cytochrome P450s can substantially alter drug metabolism, leading to improper dosing and severe adverse drug reactions. Due to low sequence conservation, predicting variant effects across cytochrome P450s is challenging. Even closely related cytochrome P450s like CYP2C9 and CYP2C19, which share 92% amino acid sequence identity, display distinct phenotypic properties. Using variant abundance by massively parallel sequencing, we measured the steady-state protein abundance of 7,660 single amino acid variants in CYP2C19 expressed in cultured human cells. Our findings confirmed critical positions and structural features essential for cytochrome P450 function, and revealed how variants at conserved positions influence abundance. We jointly analyzed 4,670 variants whose abundance was measured in both CYP2C19 and CYP2C9, finding that the homologs have different variant abundances in substrate recognition sites within the hydrophobic core. We also measured the abundance of all single and some multiple wild-type amino acid exchanges between CYP2C19 and CYP2C9. While most exchanges had no effect, substitutions in substrate recognition site 4 reduced abundance in CYP2C19. Double and triple mutants showed distinct interactions, highlighting a region that points to differing thermodynamic properties between the 2 homologs. These positions are known contributors to substrate specificity, suggesting an evolutionary tradeoff between stability and enzymatic function. Finally, we analyzed 368 previously unannotated human variants, finding that 43% had decreased abundance. By comparing variant effects between these homologs, we uncovered regions underlying their functional differences, advancing our understanding of this versatile family of enzymes.
Title: Deep mutational scanning of CYP2C19 in human cells reveals a substrate specificity-abundance tradeoff. Journal: Genetics. Published: 2024-11-06. DOI: 10.1093/genetics/iyae156. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11538415/pdf/