Pub Date: 2024-07-10; Epub Date: 2024-06-25; DOI: 10.1016/j.xgen.2024.100591
Alison A Motsinger-Reif, David M Reif, Farida S Akhtari, John S House, C Ryan Campbell, Kyle P Messier, David C Fargo, Tiffany A Bowen, Srikanth S Nadadur, Charles P Schmitt, Kristianna G Pettibone, David M Balshaw, Cindy P Lawler, Shelia A Newton, Gwen W Collman, Aubrey K Miller, B Alex Merrick, Yuxia Cui, Benedict Anchang, Quaker E Harmon, Kimberly A McAllister, Rick Woychik
Understanding the complex interplay of genetic and environmental factors in disease etiology and the role of gene-environment interactions (GEIs) across human development stages is important. We review the state of GEI research, including challenges in measuring environmental factors and advantages of GEI analysis in understanding disease mechanisms. We discuss the evolution of GEI studies from candidate gene-environment studies to genome-wide interaction studies (GWISs) and the role of multi-omics in mediating GEI effects. We review advancements in GEI analysis methods and the importance of large-scale datasets. We also address the translation of GEI findings into precision environmental health (PEH), showcasing real-world applications in healthcare and disease prevention. Additionally, we highlight societal considerations in GEI research, including environmental justice, the return of results to participants, and data privacy. Overall, we underscore the significance of GEI for disease prediction and prevention and advocate for integrating the exposome into PEH omics studies.
Title: Gene-environment interactions within a precision environmental health framework.
Journal: Cell Genomics, 100591 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11293590/pdf/
Single-cell RNA sequencing (scRNA-seq) datasets contain true single cells, or singlets, in addition to cells that coalesce during the protocol, or doublets. Identifying singlets with high fidelity in scRNA-seq is necessary to avoid false negative and false positive discoveries. Although several methodologies have been proposed, they are typically tested on highly heterogeneous datasets and lack a priori knowledge of true singlets. Here, we leveraged datasets with synthetically introduced DNA barcodes for a hitherto unexplored application: to extract ground-truth singlets. We demonstrated the feasibility of our framework, "singletCode," to evaluate existing doublet detection methods across a range of contexts. We also leveraged our ground-truth singlets to train a proof-of-concept machine learning classifier, which outperformed other doublet detection algorithms. Our integrative framework can identify ground-truth singlets and enable robust doublet detection in non-barcoded datasets.
Title: Synthetic DNA barcodes identify singlets in scRNA-seq datasets and evaluate doublet algorithms.
Authors: Ziyang Zhang, Madeline E Melzer, Keerthana M Arun, Hanxiao Sun, Carl-Johan Eriksson, Itai Fabian, Sagi Shaashua, Karun Kiani, Yaara Oren, Yogesh Goyal
Pub Date: 2024-07-10; DOI: 10.1016/j.xgen.2024.100592
Journal: Cell Genomics, 100592 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11293576/pdf/
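The idea of using lineage barcodes to recover ground-truth singlets can be illustrated with a minimal sketch. It assumes a cells-by-barcodes matrix of barcode UMI counts and calls a cell a singlet when one barcode clearly dominates. The function name `call_singlets` and the thresholds `min_umis` and `dominance` are hypothetical choices for illustration; they are not taken from singletCode, whose actual rules may differ.

```python
import numpy as np

def call_singlets(barcode_counts, min_umis=2, dominance=0.8):
    """Flag cells whose barcode UMIs are dominated by a single barcode.

    barcode_counts: (cells x barcodes) array of barcode UMI counts.
    min_umis, dominance: hypothetical thresholds, not values from the paper.
    """
    counts = np.asarray(barcode_counts, dtype=float)
    total = counts.sum(axis=1)
    top = counts.max(axis=1)
    # Fraction of a cell's barcode UMIs that come from its top barcode;
    # left at 0 for cells with no barcode detected.
    frac = np.divide(top, total, out=np.zeros_like(top), where=total > 0)
    return (top >= min_umis) & (frac >= dominance)

counts = np.array([
    [10, 0, 0],  # a single barcode: candidate singlet
    [6, 5, 0],   # two barcodes with similar support: not a confident singlet
    [0, 0, 0],   # no barcode detected: cannot be called
])
is_singlet = call_singlets(counts)
print(is_singlet)
```

Cells flagged this way could then serve as a reference set: any of them that a doublet detector labels as a doublet counts as a false positive for that detector.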
Pub Date: 2024-07-10; Epub Date: 2024-06-27; DOI: 10.1016/j.xgen.2024.100589
Alex R DeCasien, Kenneth L Chiou, Camille Testard, Arianne Mercer, Josué E Negrón-Del Valle, Samuel E Bauman Surratt, Olga González, Michala K Stock, Angelina V Ruiz-Lambides, Melween I Martínez, Susan C Antón, Christopher S Walker, Jérôme Sallet, Melissa A Wilson, Lauren J N Brent, Michael J Montague, Chet C Sherwood, Michael L Platt, James P Higham, Noah Snyder-Mackler
Humans exhibit sex differences in the prevalence of many neurodevelopmental disorders and neurodegenerative diseases. Here, we generated one of the largest multi-brain-region bulk transcriptional datasets for the rhesus macaque and characterized sex-biased gene expression patterns to investigate the translatability of this species for sex-biased neurological conditions. We identify patterns similar to those in humans, which are associated with overlapping regulatory mechanisms, biological processes, and genes implicated in sex-biased human disorders, including autism. We also show that sex-biased genes exhibit greater genetic variance for expression and more tissue-specific expression patterns, which may facilitate rapid evolution of sex-biased genes. Our findings provide insights into the biological mechanisms underlying sex-biased disease and support the rhesus macaque model for the translational study of these conditions.
Title: Evolutionary and biomedical implications of sex differences in the primate brain transcriptome.
Journal: Cell Genomics, 100589 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11293591/pdf/
Pub Date: 2024-07-10; Epub Date: 2024-06-27; DOI: 10.1016/j.xgen.2024.100586
Christoffer Bugge Harder, Shingo Miyauchi, Máté Virágh, Alan Kuo, Ella Thoen, Bill Andreopoulos, Dabao Lu, Inger Skrede, Elodie Drula, Bernard Henrissat, Emmanuelle Morin, Annegret Kohler, Kerrie Barry, Kurt LaButti, Asaf Salamov, Anna Lipzen, Zsolt Merényi, Botond Hegedüs, Petr Baldrian, Martina Stursova, Hedda Weitz, Andy Taylor, Maxim Koriabine, Emily Savage, Igor V Grigoriev, László G Nagy, Francis Martin, Håvard Kauserud
Mycena s.s. is a ubiquitous mushroom genus whose members degrade multiple dead plant substrates and opportunistically invade living plant roots. Having sequenced the nuclear genomes of 24 Mycena species, we find them to defy the expected patterns for fungi based on both their traditionally perceived saprotrophic ecology and substrate specializations. Mycena displayed massive genome expansions overall affecting all gene families, driven by novel gene family emergence, gene duplications, enlarged secretomes encoding polysaccharide degradation enzymes, transposable element (TE) proliferation, and horizontal gene transfers. Mainly due to TE proliferation, Arctic Mycena species display genomes of up to 502 Mbp (2-8× the temperate Mycena), the largest among mushroom-forming Agaricomycetes, indicating a possible evolutionary convergence to genomic expansions sometimes seen in Arctic plants. Overall, Mycena show highly unusual, varied mosaic-like genomic structures adaptable to multiple lifestyles, providing genomic illustration for the growing realization that fungal niche adaptations can be far more fluid than traditionally believed.
Title: Extreme overall mushroom genome expansion in Mycena s.s. irrespective of plant hosts or substrate specializations.
Journal: Cell Genomics, 100586 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11293592/pdf/
CRISPR mutagenesis screens conducted with SpCas9 and other nucleases have identified certain cis-regulatory elements and genetic variants but at a limited resolution due to the absence of protospacer adjacent motif (PAM) sequences. Here, leveraging the broad targeting scope of the near-PAMless SpRY variant, we have demonstrated that saturated SpRY mutagenesis and base editing screens can faithfully identify functional regulatory elements and essential genetic variants for target gene expression at single-base resolution. We further extended this methodology to investigate a genome-wide association study (GWAS) locus at 10q22.1 associated with a red blood cell trait, where we identified potential enhancers regulating HK1 gene expression, despite not all of these enhancers exhibiting typical chromatin signatures. More importantly, our saturated base editing screens pinpoint multiple causal variants within this locus that would otherwise be missed by Bayesian statistical fine-mapping. Our approach is generally applicable to functional interrogation of all non-coding genomic elements while complementing other high-coverage CRISPR screens.
Title: SpRY-mediated screens facilitate functional dissection of non-coding sequences at single-base resolution.
Authors: Yao Yao, Zhiwei Zhou, Xiaoling Wang, Zhirui Liu, Yixin Zhai, Xiaolin Chi, Jingyi Du, Liheng Luo, Zhigang Zhao, Xiaoyue Wang, Chaoyou Xue, Shuquan Rao
Pub Date: 2024-07-10; DOI: 10.1016/j.xgen.2024.100583
Journal: Cell Genomics, 100583 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11293580/pdf/
Pub Date: 2024-07-10; Epub Date: 2024-06-18; DOI: 10.1016/j.xgen.2024.100587
Katie L Burnham, Nikhil Milind, Wanseon Lee, Andrew J Kwok, Kiki Cano-Gamez, Yuxin Mi, Cyndi G Geoghegan, Ping Zhang, Stuart McKechnie, Nicole Soranzo, Charles J Hinds, Julian C Knight, Emma E Davenport
Sepsis is a clinical syndrome of life-threatening organ dysfunction caused by a dysregulated response to infection, for which disease heterogeneity is a major obstacle to developing targeted treatments. We have previously identified gene-expression-based patient subgroups (sepsis response signatures [SRS]) informative for outcome and underlying pathophysiology. Here, we aimed to investigate the role of genetic variation in determining the host transcriptomic response and to delineate regulatory networks underlying SRS. Using genotyping and RNA-sequencing data on 638 adult sepsis patients, we report 16,049 independent expression (eQTLs) and 32 co-expression module (modQTLs) quantitative trait loci in this disease context. We identified significant interactions between SRS and genotype for 1,578 SNP-gene pairs and combined transcription factor (TF) binding site information (SNP2TFBS) and predicted regulon activity (DoRothEA) to identify candidate upstream regulators. Overall, these approaches identified putative mechanistic links between host genetic variation, cell subtypes, and the individual transcriptomic response to infection.
Title: eQTLs identify regulatory networks and drivers of variation in the individual response to sepsis.
Journal: Cell Genomics, 100587 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11293594/pdf/
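An interaction between genotype and a patient subgroup such as SRS is commonly tested by adding a product term to a linear model of expression. The sketch below fits expression ~ genotype + group + genotype:group by ordinary least squares on simulated data. The 0/1 group coding, the simulated effect sizes, and the function name `interaction_fit` are assumptions for illustration; the study itself presumably used additional covariates and formal significance testing.

```python
import numpy as np

def interaction_fit(expression, genotype, group):
    """Least-squares fit of expression ~ genotype + group + genotype:group."""
    g = np.asarray(genotype, dtype=float)
    s = np.asarray(group, dtype=float)
    # Design matrix: intercept, genotype dosage, group indicator, product term
    X = np.column_stack([np.ones_like(g), g, s, g * s])
    beta, *_ = np.linalg.lstsq(X, np.asarray(expression, dtype=float), rcond=None)
    return dict(zip(["intercept", "genotype", "group", "interaction"], beta))

# Simulated data (hypothetical): the genotype effect is larger in group 1
rng = np.random.default_rng(0)
genotype = rng.integers(0, 3, size=500)   # allele dosage 0, 1, or 2
group = rng.integers(0, 2, size=500)      # subgroup indicator, coded 0/1
expression = (1.0 + 0.2 * genotype + 0.5 * group
              + 0.8 * genotype * group + rng.normal(0, 0.1, 500))

fit = interaction_fit(expression, genotype, group)
print(fit)
```

A non-zero interaction coefficient means the per-allele effect on expression differs between the two groups, which is the pattern an SRS-by-genotype interaction would show.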
Pub Date: 2024-06-12; Epub Date: 2024-05-20; DOI: 10.1016/j.xgen.2024.100563
Yocelyn Recinos, Suying Bao, Xiaojian Wang, Brittany L Phillips, Yow-Tyng Yeh, Sebastien M Weyn-Vanhentenryck, Maurice S Swanson, Chaolin Zhang
Divergence of precursor messenger RNA (pre-mRNA) alternative splicing (AS) is widespread in mammals, including primates, but the underlying mechanisms and functional impact are poorly understood. Here, we modeled cassette exon inclusion in primate brains as a quantitative trait and identified 1,170 (∼3%) exons with lineage-specific splicing shifts under stabilizing selection. Among them, microtubule-associated protein tau (MAPT) exons 2 and 10 underwent anticorrelated, two-step evolutionary shifts in the catarrhine and hominoid lineages, leading to their present inclusion levels in humans. The developmental-stage-specific divergence of exon 10 splicing, whose dysregulation can cause frontotemporal lobar degeneration (FTLD), is mediated by divergent distal intronic MBNL-binding sites. Competitive binding of these sites by CRISPR-dCas13d/gRNAs effectively reduces exon 10 inclusion, potentially providing a therapeutically compatible approach to modulate tau isoform expression. Our data suggest adaptation of MAPT function and, more generally, a role for AS in the evolutionary expansion of the primate brain.
Title: Lineage-specific splicing regulation of MAPT gene in the primate brain.
Journal: Cell Genomics, 100563 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11228892/pdf/
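Cassette exon inclusion is usually quantified as percent spliced in (PSI): the share of reads supporting exon inclusion among all reads informative for that exon. The sketch below shows that ratio under the assumption that inclusion and exclusion read counts have already been made comparable (for example, by averaging the two inclusion junctions). The function name and the example counts are hypothetical and are not taken from the paper.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Return PSI in [0, 1], or None when no informative reads exist."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # inclusion level is undefined without coverage
    return inclusion_reads / total

# Hypothetical counts: 30 reads support inclusion, 10 support skipping
psi = percent_spliced_in(30, 10)
print(psi)
```

Treating PSI as a quantitative trait then amounts to comparing values like this one across species or developmental stages.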
Pub Date: 2024-06-12; Epub Date: 2024-05-22; DOI: 10.1016/j.xgen.2024.100565
Senlin Lin, Yan Cui, Fangyuan Zhao, Zhidong Yang, Jiangning Song, Jianhua Yao, Yu Zhao, Bin-Zhi Qian, Yi Zhao, Zhiyuan Yuan
Spatially resolved transcriptomics (SRT) technologies have revolutionized the study of tissue organization. We introduce a graph convolutional network with an attention and positive emphasis mechanism, termed BINARY, relying exclusively on binarized SRT data to accurately delineate spatial domains. BINARY outperforms existing methods across various SRT data types while using significantly less input information. Our study suggests that precise gene expression quantification may not always be essential, inspiring further exploration of the broader applications of spatially resolved binarized gene expression data.
Title: Complete spatially resolved gene expression is not necessary for identifying spatial domains.
Journal: Cell Genomics, 100565 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11228956/pdf/
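Binarizing SRT data means keeping only whether each gene was detected at each spot, not how strongly. A minimal sketch, assuming a spots-by-genes count matrix and a detection threshold of zero (the threshold and the function name `binarize_expression` are assumptions; the abstract does not specify how BINARY preprocesses its input):

```python
import numpy as np

def binarize_expression(counts, threshold=0):
    """Map a count matrix to 0/1: 1 where a gene is detected above threshold."""
    return (np.asarray(counts) > threshold).astype(np.uint8)

# Hypothetical counts for 2 spots x 3 genes
counts = np.array([
    [0, 3, 12],
    [1, 0, 0],
])
binary = binarize_expression(counts)
print(binary)
```

After this step a count of 3 and a count of 12 are indistinguishable, which is the information loss the abstract argues spatial domain identification can tolerate.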
Pub Date: 2024-06-12; Epub Date: 2024-05-29; DOI: 10.1016/j.xgen.2024.100580
Unnati Sonawala, Helen Beasley, Peter Thorpe, Kyriakos Varypatakis, Beatrice Senatori, John T Jones, Lida Derevnina, Sebastian Eves-van den Akker
Pathogens are engaged in a fierce evolutionary arms race with their host. The genes at the forefront of the engagement between kingdoms are often part of diverse and highly mutable gene families. Even in this context, we discovered unprecedented variation in the hyper-variable (HYP) effectors of plant-parasitic nematodes. HYP effectors are single-gene loci that potentially harbor thousands of alleles. Alleles vary in the organization, as well as the number, of motifs within a central hyper-variable domain (HVD). We dramatically expand the HYP repertoire of two plant-parasitic nematodes and define distinct species-specific "rules" underlying the apparently flawless genetic rearrangements. Finally, by analyzing the HYPs in 68 individual nematodes, we unexpectedly found that despite the huge number of alleles, most individuals are germline homozygous. These data support a mechanism of programmed genetic variation, termed HVD editing, where alterations are locus specific, strictly governed by rules, and theoretically produce thousands of variants without errors.
Title: A gene with a thousand alleles: The hyper-variable effectors of plant-parasitic nematodes.
Journal: Cell Genomics, 100580 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11228951/pdf/
Pub Date: 2024-06-12; Epub Date: 2024-05-31; DOI: 10.1016/j.xgen.2024.100581
Jesus Gonzalez-Ferrer, Julian Lehrer, Ash O'Farrell, Benedict Paten, Mircea Teodorescu, David Haussler, Vanessa D Jonsson, Mohammed A Mostajo-Radji
Cell atlases serve as vital references for automating cell labeling in new samples, yet existing classification algorithms struggle with accuracy. Here we introduce SIMS (scalable, interpretable machine learning for single cell), a low-code data-efficient pipeline for single-cell RNA classification. We benchmark SIMS against datasets from different tissues and species. We demonstrate SIMS's efficacy in classifying cells in the brain, achieving high accuracy even with small training sets (<3,500 cells) and across different samples. SIMS accurately predicts neuronal subtypes in the developing brain, shedding light on genetic changes during neuronal differentiation and postmitotic fate refinement. Finally, we apply SIMS to single-cell RNA datasets of cortical organoids to predict cell identities and uncover genetic variations between cell lines. SIMS identifies cell-line differences and misannotated cell lineages in human cortical organoids derived from different pluripotent stem cell lines. Altogether, we show that SIMS is a versatile and robust tool for cell-type classification from single-cell datasets.
Title: SIMS: A deep-learning label transfer tool for single-cell RNA sequencing analysis.
Journal: Cell Genomics, 100581 (IF 11.1). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11228957/pdf/