Genome Research: latest articles

The grasshopper genome reveals long-term gene content conservation of the X Chromosome and temporal variation in X Chromosome evolution.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-08-20 | DOI: 10.1101/gr.278794.123
Xinghua Li, Judith E Mank, Liping Ban

We present the first chromosome-level genome assembly of the grasshopper, Locusta migratoria, one of the largest insect genomes. We use coverage differences between females (XX) and males (X0) to identify the X Chromosome gene content, and find that the X Chromosome shows both complete dosage compensation in somatic tissues and an underrepresentation of testis-expressed genes. X-linked gene content from L. migratoria is highly conserved across seven insect orders, namely Orthoptera, Odonata, Phasmatodea, Hemiptera, Neuroptera, Coleoptera, and Diptera, and the 800 Mb grasshopper X Chromosome is homologous to the fly ancestral X Chromosome despite 400 million years of divergence, suggesting either repeated origin of sex chromosomes with highly similar gene content, or long-term conservation of the X Chromosome. We use this broad conservation of the X Chromosome to test for temporal dynamics in Fast-X evolution, and find evidence of a recent burst of evolution for new X-linked genes, in contrast to the slow evolution of X-conserved genes.
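
The coverage-based identification of X-linked sequence mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the input dictionaries of normalized read depth, and the `tolerance` threshold are all assumptions made for illustration. The only idea taken from the abstract is that, with XX females and X0 males, X-linked sequence should show about half the male coverage relative to female coverage.

```python
import math


def classify_by_coverage(male_cov, female_cov, tolerance=0.3):
    """Label each scaffold 'X', 'autosome', or 'ambiguous'.

    male_cov / female_cov: hypothetical dicts mapping scaffold name to
    read depth already normalized for sequencing effort.
    With XX females and X0 males, X-linked sequence is expected near
    log2(male / female) = -1 and autosomal sequence near 0.
    """
    labels = {}
    for name, m in male_cov.items():
        f = female_cov.get(name, 0.0)
        if m <= 0 or f <= 0:
            # No usable depth in one sex: do not guess.
            labels[name] = "ambiguous"
            continue
        ratio = math.log2(m / f)
        if abs(ratio + 1.0) <= tolerance:
            labels[name] = "X"
        elif abs(ratio) <= tolerance:
            labels[name] = "autosome"
        else:
            labels[name] = "ambiguous"
    return labels
```

A real analysis would typically work in genomic windows and choose the threshold from the observed ratio distribution; the fixed tolerance here is only a placeholder.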

Genome Research, pp. 997-1007. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368200/pdf/

Streamlined spatial and environmental expression signatures characterize the minimalist duckweed Wolffia australiana.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-08-20 | DOI: 10.1101/gr.279091.124
Tom Denyer, Pin-Jou Wu, Kelly Colt, Bradley W Abramson, Zhili Pang, Pavel Solansky, Allen Mamerto, Tatsuya Nobori, Joseph R Ecker, Eric Lam, Todd P Michael, Marja C P Timmermans

Single-cell genomics permits a new resolution in the examination of molecular and cellular dynamics, allowing global, parallel assessments of cell types and cellular behaviors through development and in response to environmental circumstances, such as interaction with water and the light-dark cycle of the Earth. Here, we leverage the smallest, and possibly most structurally reduced, plant, the semiaquatic Wolffia australiana, to understand dynamics of cell expression in these contexts at the whole-plant level. We examined single-cell-resolution RNA-sequencing data and found Wolffia cells divide into four principal clusters representing the above- and below-water-situated parenchyma and epidermis. Although these tissues share transcriptomic similarity with model plants, they display distinct adaptations that Wolffia has made for the aquatic environment. Within this broad classification, discrete subspecializations are evident, with select cells showing unique transcriptomic signatures associated with developmental maturation and specialized physiologies. Assessing this simplified biological system temporally at two key time-of-day (TOD) transitions, we identify additional TOD-responsive genes previously overlooked in whole-plant transcriptomic approaches and demonstrate that the core circadian clock machinery and its downstream responses can vary in cell-specific manners, even in this simplified system. Distinctions between cell types and their responses to submergence and/or TOD are driven by expression changes of unexpectedly few genes, characterizing Wolffia as a highly streamlined organism with the majority of genes dedicated to fundamental cellular processes. Wolffia provides a unique opportunity to apply reductionist biology to elucidate signaling functions at the organismal level, for which this work provides a powerful resource.

Genome Research, pp. 1106-1120. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368201/pdf/

CodonBERT large language model for mRNA vaccines.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-08-20 | DOI: 10.1101/gr.278870.123
Sizhen Li, Saeed Moayedpour, Ruijiang Li, Michael Bailey, Saleh Riahi, Lorenzo Kogler-Anele, Milad Miladi, Jacob Miner, Fabien Pertuy, Dinghai Zheng, Jun Wang, Akshay Balsubramani, Khang Tran, Minnie Zacharia, Monica Wu, Xiaobo Gu, Ryan Clinton, Carla Asquith, Joseph Skaleski, Lianne Boeglin, Sudha Chivukula, Anusha Dias, Tod Strugnell, Fernando Ulloa Montoya, Vikram Agarwal, Ziv Bar-Joseph, Sven Jager

mRNA-based vaccines and therapeutics are gaining popularity and usage across a wide range of conditions. One of the critical issues when designing such mRNAs is sequence optimization. Even small proteins or peptides can be encoded by an enormously large number of mRNAs. The actual mRNA sequence can have a large impact on several properties, including expression, stability, immunogenicity, and more. To enable the selection of an optimal sequence, we developed CodonBERT, a large language model (LLM) for mRNAs. Unlike prior models, CodonBERT uses codons as inputs, which enables it to learn better representations. CodonBERT was trained using more than 10 million mRNA sequences from a diverse set of organisms. The resulting model captures important biological concepts. CodonBERT can also be extended to perform prediction tasks for various mRNA properties. CodonBERT outperforms previous mRNA prediction methods, including on a new flu vaccine data set.
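
Two points in the abstract lend themselves to a small illustration: a peptide can be encoded by a very large number of mRNAs, and the model takes codons rather than single nucleotides as inputs. The sketch below is not CodonBERT's tokenizer; the function names are hypothetical. The degeneracy table is the standard genetic code (number of codons per amino acid).

```python
# Number of codons per amino acid in the standard genetic code.
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def codon_tokens(coding_sequence):
    """Split a coding sequence into codon tokens (DNA 'T' is read as RNA 'U')."""
    seq = coding_sequence.upper().replace("T", "U")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def count_synonymous_mrnas(peptide):
    """Count the coding sequences that translate to the given peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_DEGENERACY[residue]
    return total
```

For example, `count_synonymous_mrnas("MKL")` multiplies 1, 2 and 6 to give 12 candidate coding sequences for a three-residue peptide, and the count grows multiplicatively with length.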

Genome Research, pp. 1027-1035. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368176/pdf/

High-fidelity, large-scale targeted profiling of microsatellites.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-08-20 | DOI: 10.1101/gr.278785.123
Caitlin A Loh, Danielle A Shields, Adam Schwing, Gilad D Evrony

Microsatellites are highly mutable sequences that can serve as markers for relationships among individuals or cells within a population. The accuracy and resolution of reconstructing these relationships depend on the fidelity of microsatellite profiling and the number of microsatellites profiled. However, current methods for targeted profiling of microsatellites incur significant "stutter" artifacts that interfere with accurate genotyping, and sequencing costs preclude whole-genome microsatellite profiling of a large number of samples. We developed a novel method for accurate and cost-effective targeted profiling of a panel of more than 150,000 microsatellites per sample, along with a computational tool for designing large-scale microsatellite panels. Our method addresses the greatest challenge for microsatellite profiling, "stutter" artifacts, with a low-temperature hybridization capture that significantly reduces these artifacts. We also developed a computational tool for accurate genotyping of the resulting microsatellite sequencing data that uses an ensemble approach integrating three microsatellite genotyping tools, which we optimize by analysis of de novo microsatellite mutations in human trios. Altogether, our suite of experimental and computational tools enables high-fidelity, large-scale profiling of microsatellites, which may find utility in diverse applications such as lineage tracing, population genetics, ecology, and forensics.
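
The abstract does not say how the three genotyping tools are combined, so the following is only one plausible reading of "ensemble approach": a simple agreement vote per locus. The function name, the genotype representation (a tuple of repeat-unit counts per allele), and the `min_agree` parameter are assumptions for illustration, not the published tool's interface.

```python
from collections import Counter


def ensemble_genotype(calls, min_agree=2):
    """Return the genotype supported by at least `min_agree` callers, else None.

    calls: one entry per genotyping tool; each entry is a tuple of allele
    lengths in repeat units (e.g. (12, 14)) or None when the tool made no call.
    Allele order is ignored, so (12, 14) and (14, 12) count as the same call.
    """
    normalized = [tuple(sorted(c)) for c in calls if c is not None]
    if not normalized:
        return None
    genotype, votes = Counter(normalized).most_common(1)[0]
    return genotype if votes >= min_agree else None
```

Returning `None` when the callers disagree trades call rate for accuracy, which is the kind of setting one could tune against de novo mutation rates observed in trios.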

Genome Research, pp. 1008-1026. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368184/pdf/

Interspecies regulatory landscapes and elements revealed by novel joint systematic integration of human and mouse blood cell epigenomes.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-08-20 | DOI: 10.1101/gr.277950.123
Guanjue Xiang, Xi He, Belinda M Giardine, Kathryn J Isaac, Dylan J Taylor, Rajiv C McCoy, Camden Jansen, Cheryl A Keller, Alexander Q Wixom, April Cockburn, Amber Miller, Qian Qi, Yanghua He, Yichao Li, Jens Lichtenberg, Elisabeth F Heuston, Stacie M Anderson, Jing Luan, Marit W Vermunt, Feng Yue, Michael E G Sauria, Michael C Schatz, James Taylor, Berthold Göttgens, Jim R Hughes, Douglas R Higgs, Mitchell J Weiss, Yong Cheng, Gerd A Blobel, David M Bodine, Yu Zhang, Qunhua Li, Shaun Mahony, Ross C Hardison

Knowledge of locations and activities of cis-regulatory elements (CREs) is needed to decipher basic mechanisms of gene regulation and to understand the impact of genetic variants on complex traits. Previous studies identified candidate CREs (cCREs) using epigenetic features in one species, making comparisons difficult between species. In contrast, we conducted an interspecies study defining epigenetic states and identifying cCREs in blood cell types to generate regulatory maps that are comparable between species, using integrative modeling of eight epigenetic features jointly in human and mouse in our Validated Systematic Integration (VISION) Project. The resulting catalogs of cCREs are useful resources for further studies of gene regulation in blood cells, indicated by high overlap with known functional elements and strong enrichment for human genetic variants associated with blood cell phenotypes. The contribution of each epigenetic state in cCREs to gene regulation, inferred from a multivariate regression, was used to estimate epigenetic state regulatory potential (esRP) scores for each cCRE in each cell type, which were used to categorize dynamic changes in cCREs. Groups of cCREs displaying similar patterns of regulatory activity in human and mouse cell types, obtained by joint clustering on esRP scores, harbor distinctive transcription factor binding motifs that are similar between species. An interspecies comparison of cCREs revealed both conserved and species-specific patterns of epigenetic evolution. Finally, we show that comparisons of the epigenetic landscape between species can reveal elements with similar roles in regulation, even in the absence of genomic sequence alignment.

Genome Research, pp. 1089-1105. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368181/pdf/

Pangenome-spanning epistasis and coselection analysis via de Bruijn graphs.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-08-20 | DOI: 10.1101/gr.278485.123
Juri Kuronen, Samuel T Horsfield, Anna K Pöntinen, Sudaraka Mallawaarachchi, Sergio Arredondo-Alonso, Harry Thorpe, Rebecca A Gladstone, Rob J L Willems, Stephen D Bentley, Nicholas J Croucher, Johan Pensar, John A Lees, Gerry Tonkin-Hill, Jukka Corander

Studies of bacterial adaptation and evolution are hampered by the difficulty of measuring traits such as virulence, drug resistance, and transmissibility in large populations. In contrast, it is now feasible to obtain high-quality complete assemblies of many bacterial genomes thanks to scalable high-accuracy long-read sequencing technologies. To exploit this opportunity, we introduce a phenotype- and alignment-free method for discovering coselected and epistatically interacting genomic variation from genome assemblies covering both core and accessory parts of genomes. Our approach uses a compact colored de Bruijn graph to approximate the intragenome distances between pairs of loci for a collection of bacterial genomes to account for the impacts of linkage disequilibrium (LD). We demonstrate the versatility of our approach to efficiently identify associations between loci linked with drug resistance and adaptation to the hospital niche in the major human bacterial pathogens Streptococcus pneumoniae and Enterococcus faecalis.
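
To make the idea of approximating distances between loci on a colored de Bruijn graph concrete, here is a toy sketch. It is not the authors' implementation: it builds an uncompacted, node-centric graph, ignores reverse complements, treats edges as undirected with unit weight, and uses hypothetical function names. "Colors" are simply the set of genomes in which each k-mer occurs.

```python
from collections import defaultdict, deque


def build_colored_dbg(genomes, k):
    """Build a toy colored de Bruijn graph.

    genomes: dict mapping genome name to sequence string.
    Returns (edges, colors): adjacency sets between consecutive k-mers and,
    for each k-mer, the set of genome names that contain it.
    """
    edges = defaultdict(set)
    colors = defaultdict(set)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            colors[kmer].add(name)
            if i + k < len(seq):
                nxt = seq[i + 1:i + k + 1]
                edges[kmer].add(nxt)
                edges[nxt].add(kmer)  # undirected, for distance queries
    return edges, colors


def graph_distance(edges, source, target):
    """Breadth-first shortest path length between two k-mers, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in edges.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None
```

In this toy version one edge corresponds to a one-base shift, so the path length approximates the number of bases separating two k-mers; pairs of loci that are close in the graph would be the ones where linkage disequilibrium is expected to inflate associations.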

Genome Research, pp. 1081-1088. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368177/pdf/

A gene regulatory network-aware graph learning method for cell identity annotation in single-cell RNA-seq data.
IF 6.2 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-08-20 | DOI: 10.1101/gr.278439.123
Mengyuan Zhao, Jiawei Li, Xiaoyi Liu, Ke Ma, Jijun Tang, Fei Guo

Cell identity annotation for single-cell transcriptome data is a crucial process for constructing cell atlases, unraveling pathogenesis, and inspiring therapeutic approaches. Currently, the efficacy of existing methodologies is contingent upon specific data sets. Nevertheless, such data are often sourced from various batches, sequencing technologies, tissues, and even species. Notably, the gene regulatory relationship remains unaffected by the aforementioned factors, highlighting the extensive gene interactions within organisms. Therefore, we propose scHGR, an automated annotation tool designed to leverage gene regulatory relationships in constructing gene-mediated cell communication graphs for single-cell transcriptome data. This strategy helps reduce noise from diverse data sources while establishing distant cellular connections, yielding valuable biological insights. Experiments involving 22 scenarios demonstrate that scHGR precisely and consistently annotates cell identities, benchmarked against state-of-the-art methods. Crucially, scHGR uncovers novel subtypes within peripheral blood mononuclear cells, specifically from CD4+ T cells and cytotoxic T cells. Furthermore, by characterizing a cell atlas comprising 56 cell types for COVID-19 patients, scHGR identifies vital factors like IL1 and calcium ions, offering insights for targeted therapeutic interventions.

{"title":"A gene regulatory network-aware graph learning method for cell identity annotation in single-cell RNA-seq data.","authors":"Mengyuan Zhao, Jiawei Li, Xiaoyi Liu, Ke Ma, Jijun Tang, Fei Guo","doi":"10.1101/gr.278439.123","DOIUrl":"10.1101/gr.278439.123","url":null,"abstract":"<p><p>Cell identity annotation for single-cell transcriptome data is a crucial process for constructing cell atlases, unraveling pathogenesis, and inspiring therapeutic approaches. Currently, the efficacy of existing methodologies is contingent upon specific data sets. Nevertheless, such data are often sourced from various batches, sequencing technologies, tissues, and even species. Notably, the gene regulatory relationship remains unaffected by the aforementioned factors, highlighting the extensive gene interactions within organisms. Therefore, we propose scHGR, an automated annotation tool designed to leverage gene regulatory relationships in constructing gene-mediated cell communication graphs for single-cell transcriptome data. This strategy helps reduce noise from diverse data sources while establishing distant cellular connections, yielding valuable biological insights. Experiments involving 22 scenarios demonstrate that scHGR precisely and consistently annotates cell identities, benchmarked against state-of-the-art methods. Crucially, scHGR uncovers novel subtypes within peripheral blood mononuclear cells, specifically from CD4<sup>+</sup> T cells and cytotoxic T cells. 
Furthermore, by characterizing a cell atlas comprising 56 cell types for COVID-19 patients, scHGR identifies vital factors like IL1 and calcium ions, offering insights for targeted therapeutic interventions.</p>","PeriodicalId":12678,"journal":{"name":"Genome research","volume":" ","pages":"1036-1051"},"PeriodicalIF":6.2,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368180/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141970983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Widespread natural selection on metabolite levels in humans
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-08-16 DOI: 10.1101/gr.278756.123
Yanina Timasheva, Kaido Lepik, Orsolya Liska, Balázs Papp, Zoltan Kutalik
Natural selection acts ubiquitously on complex human traits, predominantly constraining the occurrence of extreme phenotypes (stabilizing selection). These constraints propagate to DNA sequence variants associated with traits under selection. The genetic imprints of such evolutionary events can thus be detected by combining effect size estimates from genetic association studies with the corresponding allele frequencies. While this approach has been successfully applied to high-level traits, the prevalence and mode of selection acting on molecular traits remain poorly understood. Here, we estimate the action of natural selection on genetic variants associated with metabolite levels, an important layer of molecular traits. By leveraging summary statistics of published genome-wide association studies with large sample sizes, we find strong evidence of stabilizing selection for 15 out of 97 plasma metabolites. Mendelian randomization analysis revealed that metabolites under stronger stabilizing selection display larger effects on a range of clinically relevant complex traits, suggesting that maintaining a disease-free profile may be an important source of selective constraints on the metabolome. Metabolites under strong stabilizing selection in humans are also more conserved in their concentrations among diverse mammalian species, suggesting shared selective forces across micro- and macroevolutionary time scales. Finally, we also found evidence for both disruptive and directional selection on specific lipid metabolites, potentially indicating ongoing evolutionary adaptation in humans. Overall, this study demonstrates that variation in metabolite levels among humans is frequently shaped by natural selection, which may act through the metabolites' causal impact on disease susceptibility.
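The idea of combining effect sizes with allele frequencies can be illustrated with a minimal sketch. It assumes a simple textbook-style relationship in which the squared effect size scales as a power of heterozygosity, E[beta^2] ∝ [2p(1-p)]^alpha, so that a negative alpha means rarer variants carry larger effects. This is not the estimator used in the study; the function names and the toy data are hypothetical, and real analyses would have to account for estimation noise in the effect sizes.

```python
import math

def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for allele frequency p."""
    return 2.0 * p * (1.0 - p)

def alpha_slope(freqs, betas):
    """Ordinary least-squares slope of log(beta^2) on log(2p(1-p)).

    Hypothetical helper: a negative slope means effect sizes grow
    as variants become rarer.
    """
    xs = [math.log(heterozygosity(p)) for p in freqs]
    ys = [math.log(b * b) for b in betas]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy, noise-free data constructed so that beta^2 = [2p(1-p)]^(-0.5).
freqs = [0.01, 0.05, 0.1, 0.2, 0.3, 0.5]
betas = [heterozygosity(p) ** -0.25 for p in freqs]
slope = alpha_slope(freqs, betas)
print(round(slope, 3))
```

On the toy data the fitted slope recovers the exponent that was built in; with real summary statistics the estimate would be biased by sampling error unless that error is modelled explicitly.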
Citations: 0
Colibactin leads to a bacteria-specific mutation pattern and self-inflicted DNA damage
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-08-16 DOI: 10.1101/gr.279517.124
Emily Lowry, Yiqing Wang, Tal Dagan, Amir Mitchell
Colibactin, produced primarily by Escherichia coli strains of the B2 phylogroup, crosslinks DNA and can promote colon cancer in human hosts. We investigated the toxin's impact on colibactin producers and on bacteria co-cultured with producing cells. Using genome-wide genetic screens and mutation accumulation experiments, we uncovered the cellular pathways that mitigate colibactin damage and revealed the specific mutations it induces. We discovered that while colibactin targets A/T-rich motifs, as observed in human colon cells, it induces a bacteria-unique mutation pattern. Based on this pattern, we predicted that long-term colibactin exposure will culminate in a genomic bias in trinucleotide composition. We tested this prediction by analyzing thousands of E. coli genomes and found that colibactin-producing strains indeed show the predicted skewness in trinucleotide composition. Our work revealed a bacteria-specific mutation pattern and suggests that the resistance protein encoded on the colibactin pathogenicity island is insufficient to prevent self-inflicted DNA damage.
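Trinucleotide composition, the quantity the prediction above is stated in, can be computed with a short sketch. It assumes a single-strand sliding window and skips windows containing ambiguous bases; whether the study collapsed reverse complements or normalized differently is not stated in the abstract, so the function name and the toy sequence are purely illustrative.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def trinucleotide_counts(seq):
    """Count overlapping trinucleotides on the given strand.

    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if set(tri) <= VALID_BASES:
            counts[tri] += 1
    return counts

def trinucleotide_freqs(seq):
    """Convert counts into relative frequencies (empty dict if no windows)."""
    counts = trinucleotide_counts(seq)
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()} if total else {}

# Toy sequence (hypothetical, not genome data).
toy_counts = trinucleotide_counts("AAATTTAAAT")
toy_freqs = trinucleotide_freqs("AAATTTAAAT")
print(toy_counts["AAA"], toy_freqs["AAA"])
```

Comparing such frequency tables between groups of genomes (for example producers versus non-producers) would be one way to look for a compositional skew, subject to controlling for phylogeny and overall GC content.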
Citations: 0
Allele specific transcription factor binding across human brain regions offers mechanistic insight into eQTLs
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-08-16 DOI: 10.1101/gr.278601.123
Ashlyn G Anderson, Belle A Moyers, Jacob M Loupe, Ivan Rodriguez-Nunez, Stephanie A Felker, James M.J. Lawlor, William E Bunney, Blynn G Bunney, Preston M Cartagena, Adolfo Sequeira, Stanley Watson, Huda Akil, Eric M Mendenhall, Gregory M Cooper, Richard M. Myers
Transcription factors (TFs) regulate gene expression by facilitating or disrupting the formation of transcription initiation machinery at particular genomic loci. Since TF occupancy is driven in part by recognition of DNA sequence, genetic variation can influence TF-DNA associations and gene regulation. To identify variants that impact TF binding in human brain tissues, we assessed allele-specific binding (ASB) at heterozygous variants for 94 TFs in 9 brain regions from two donors. Leveraging graph genomes constructed from phased genomic sequence data, we compared ChIP-seq signals between alleles at heterozygous variants within each brain region and identified thousands of variants exhibiting ASB for at least one TF. ASB reproducibility was measured by comparisons between independent experiments both within and between donors. We found that rarer alleles in the general population more frequently led to reduced TF binding, whereas common variation had an equal likelihood of increasing or decreasing binding. Motif analysis revealed TF-specific effects, with ASB variants for certain TFs displaying a greater incidence of motif alterations, as well as enrichments for variants under purifying selection. Notably, neuron-specific cis-regulatory elements (cCREs) showed depletion for ASB variants. We identified 2,670 ASB variants with prior evidence of allele-specific gene expression in the brain from GTEx data and observed that eQTL effect-direction concordance increases with ASB significance. These results provide a valuable and unique resource for mechanistic analysis of cis-regulatory variation in human brain tissue.
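Comparing ChIP-seq signal between the two alleles at a heterozygous site is often illustrated with an exact binomial test against a 50:50 expectation. The sketch below shows that generic test only; it is an assumption for illustration, not the statistical procedure of the study, which relied on graph genomes and is not described in detail in the abstract. The function name and the read counts are hypothetical.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n reads, under success rate p.
    """
    pmf = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical read counts at one heterozygous variant:
# 9 reads carry the reference allele, 1 read carries the alternate allele.
ref_reads, alt_reads = 9, 1
p_value = binom_two_sided_p(ref_reads, ref_reads + alt_reads)
print(p_value)
```

In practice such a test would need correction for reference-mapping bias and for multiple testing across variants and TFs, which this sketch deliberately leaves out.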
Citations: 0