Background: The quiescent center of plant roots plays a critical role in maintaining stemness and proliferative activity of surrounding meristem cells that drive root growth. However, it remains difficult to distinguish and isolate quiescent center cells for transcriptomic analysis.
Results: To overcome these challenges, we develop a protocol for isolating intact quiescent center cells from Arabidopsis root tips and perform single-cell long-read RNA sequencing of quiescent center cells across two developmental stages. The analysis reveals a transition from hormone signaling and proliferation toward differentiation between 5 and 10 days after germination. We capture 6,713 previously unannotated transcripts, including several isoform types. We validate ROOT MERISTEM GROWTH FACTOR 10 (RGF10) as a quiescent center-specific marker gene. We also detect enrichment of transcripts associated with cell growth, auxin response, and cell division at 5 days after germination; this enrichment is lost by 10 days. Functional analysis suggests that the quiescent center marker TERMINAL EAR1-like (TEL1) is involved in maintaining stem cell identity via regulation of WUSCHEL-RELATED HOMEOBOX 5 (WOX5). Comparative analysis of rice quiescent center transcriptomes reveals auxin biosynthesis and signaling pathways distinct from those in Arabidopsis.
Conclusions: This study provides a method for isolating rare cell types and generates long- and short-read transcriptomic atlases, offering novel mechanistic insights into stem cell maintenance in monocot and eudicot roots. The findings have implications for improving crop root systems.
Title: Single-cell transcriptomic analysis of plant quiescent center by third-generation sequencing reveals developmental trajectories. Authors: Guihua Hu, Cong Li, Weijun Guo, Dongwei Li, Liwen Yang, Xiaofeng Gu. Journal: Genome Biology. Pub Date: 2026-02-09 DOI: 10.1186/s13059-026-03989-0
Background: Ovarian cancer remains highly lethal due to late-stage diagnosis and aggressive metastatic potential. Lysosomal acidification plays a critical role in tumor metastasis, regulated by multiple genetic factors. While GATA4 is a well-established transcriptional activator in cardiomyogenesis, its function in carcinogenesis remains ambiguous, particularly in ovarian cancer, and its impact on lysosomal regulation is poorly understood. Therefore, we aim to elucidate the function of GATA4 in ovarian cancer metastasis and the underlying molecular mechanisms.
Results: We find that downregulation of GATA4 in ovarian cancer correlates with enhanced lysosomal acidification, increased cell proliferation, and elevated lung and abdominal metastasis both in vitro and in vivo. Mechanistically, GATA4 interacts with the CRL4B complex and undergoes CUL4B-mediated ubiquitination at lysine residues 329 and 404. Integrated RNA-seq and CUT&Tag analyses, ChIP-qPCR, and dual-luciferase assays reveal that GATA4 activates H3K27ac modification at the tumor suppressor gene TRIM22, consequently modulating epithelial-mesenchymal transition and lysosomal acidification. These findings demonstrate that GATA4 suppresses lysosomal acidification and epithelial-mesenchymal transition, while its CRL4B-mediated ubiquitination and degradation in metastatic ovarian cancer cells lead to reduced H3K27ac modification at tumor suppressor genes.
Conclusions: Our study elucidates the tumor-suppressive role of GATA4 in regulating lysosomal acidification and epithelial-mesenchymal transition through ubiquitination-dependent mechanisms and histone acetylation modulation. The findings identify GATA4 as a promising therapeutic target and diagnostic marker for ovarian cancer intervention.
Title: Ubiquitination degradation of GATA4 by CUL4B promotes ovarian cancer metastasis by inducing lysosomal acidification. Authors: Xin Yin, Rufei Gao, Yanqing Geng, Xinyi Mu, Yan Zhang, Yidan Ma, Xuemei Chen, Fei Han, Zhuxiu Chen, Fangfang Li, Junlin He. Journal: Genome Biology. Pub Date: 2026-02-09 DOI: 10.1186/s13059-026-03982-7
Pub Date: 2026-02-09 DOI: 10.1186/s13059-026-03958-7
Judit García-González, Alanna C Cote, Saul Garcia-Gonzalez, Lathan Liou, Paul F O'Reilly
Background: Fine-mapping and gene-prioritisation techniques applied to the latest genome-wide association study (GWAS) results have prioritised hundreds of genes as causally associated with disease. Here we leverage these recently compiled lists of high-confidence causal genes to interrogate where in the body disease genes operate, providing a more direct approach than previous studies, which have primarily relied on the enrichment of GWAS signals among genes with cell- or tissue-specific expression.
Results: By integrating GWAS summary statistics, gene prioritisation results, and RNA-seq data from 46 tissues and 204 cell types, we directly analyse the gene expression of putative disease genes across the body in relation to 11 major diseases and cancers. In tissues and cell types with established disease relevance, disease genes show higher and more specific gene expression compared to control genes. Moreover, we detect elevated expression in tissues and cell types without previous links to the corresponding disease. While some of these results may be explained by cell types that span multiple tissues, such as macrophages in brain, blood, lung and spleen in relation to Alzheimer's disease (P-values < 10⁻³), the cause for others is unclear and warrants further investigation. To support functional follow-up studies of disease genes, we identify technical and biological factors influencing their expression. Finally, we highlight tissue-disease pairs in which significantly elevated expression is associated with increased odds of inclusion in drug development programmes.
Conclusions: We provide our systematic testing framework as an open-source, publicly available tool that can be utilised to offer novel insights into the genes, tissues and cell types involved in any disease, with the potential for informing drug development and delivery strategies.
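The comparison of disease genes against control genes described in the Results can be illustrated with a one-sided permutation test. This is a minimal sketch under stated assumptions, not the authors' framework: the function name `permutation_p_value`, the choice of a permutation test, and the toy expression values are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def permutation_p_value(disease_expr, control_expr, n_perm=2000, seed=0):
    """One-sided permutation test: is mean expression higher in disease genes?

    Hypothetical helper for illustration; the published tool may use a
    different test statistic and a different control-gene design.
    """
    rng = random.Random(seed)
    observed = mean(disease_expr) - mean(control_expr)
    pooled = list(disease_expr) + list(control_expr)
    k = len(disease_expr)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the gene labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


# Toy log-expression values for a single tissue (made up, not from the study).
disease_genes = [8.1, 7.9, 8.4, 8.8, 7.6]
control_genes = [5.2, 6.1, 4.8, 5.5, 6.3, 5.0, 5.9, 4.7, 6.0, 5.4]

p = permutation_p_value(disease_genes, control_genes)
print(f"one-sided permutation p-value: {p:.4f}")
```

Repeating such a test for every tissue-disease pair would call for multiple-testing correction, which this sketch leaves out.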
Title: The gene expression landscape of disease genes. Journal: Genome Biology.
Background: The complex historical phenomenon known as Greek colonization refers to the strategic establishment of new settlements (colonies) from the eighth to the early fourth century BCE. Unlike earlier migrations, this process was planned and driven by the need to expand trade, access resources, and develop economic as well as political networks. Corinth, a prominent commercial center in southern Greece, is a leading example of a city that initiated colonization. By founding colonies, Corinth established a safe and continuous route for moving goods along the coasts of western mainland Greece and the Adriatic. Amvrakia was one of Corinth's principal colonies along this route in northwestern Greece. Founded in the seventh century BCE, Amvrakia was characterized by a strong dependence on its metropolis (Corinth). Here, we aim to investigate the genetic relationships between the Corinthian metropolis and the Amvrakia colony, the contribution of the local population to the founding genetic pool, as well as the demography of Amvrakia in subsequent periods.
Results: During its foundation in the Archaic period, Amvrakia appears to have been shaped by genetic influences from a single source. This source migrated from the territory of Corinth, represented here by the Archaic Tenea population, and the link is supported by an identity-by-descent (IBD) analysis. Direct ancestry from Late Bronze Age (LBA) Greece, including a local LBA population represented by the Ammotopos site located in close proximity to Amvrakia, was not inferred in any of several independent population genomic analyses. During the subsequent Classical and Hellenistic periods, the population of Amvrakia appears to have differentiated only slightly, and evidence of genetic continuity over time is observed.
Conclusions: The migration of Corinthians to Amvrakia was the major contributor to the initial genetic pool of the colony, indicating that the Corinthian colonization included both genetic and cultural transmission between the metropolis and its colony.
Title: Genetic affinities between the ancient Greek colony of Amvrakia and its metropolis. Authors: Nikolaos Psonis, Eugenia Tabakaki, Despoina Vassou, Stefanos Papadantonakis, Angelos Souleles, Argyro Nafplioti, Georgios Kousis Tsampazis, Angeliki Papadopoulou, Kiriakos Xanthopoulos, Panagiotis Panailidis, Angeliki Georgiadou, Dimitra Papakosta, Sevasti Koursioti, Maria Evangelinou, Varvara Papadopoulou, Paraskevi Evaggeloglou, Elena Korka, Ioannis Christidis, Michael Ioannou, Theodora Kontogianni, Athanasios Arkoumanis, Alexandros Stamatakis, Nikos Poulakakis, Christina Papageorgopoulou, Pavlos Pavlidis. Journal: Genome Biology. Pub Date: 2026-02-07 DOI: 10.1186/s13059-026-03968-5
Pub Date: 2026-02-07 DOI: 10.1186/s13059-026-03947-w
Xiaopu Zhang, Idil Yet, Sergio Villicaña, Juan Castillo-Fernandez, Massimo Mangino, Jouke Jan Hottenga, Pei-Chien Tsai, Josine L Min, Mario Falchi, Andrew Wong, Dorret I Boomsma, Ken K Ong, Jenny van Dongen, Jordana T Bell
Background: Genetic variants that are associated with phenotypic variability, or variance quantitative trait loci (vQTLs), have been detected for multiple human traits. Gene-environment interactions can lead to differential phenotypic variability across genotype groups; therefore, genetic variants that interact with environmental exposures can manifest as vQTLs. Although changes in DNA methylation variability have been observed in several diseases, vQTLs for methylation levels (vmeQTLs) have not yet been explored in depth.
Results: We optimize the value of monozygotic twin studies to identify and replicate vmeQTLs for blood DNA methylation variance at 358 CpGs in 988 adult monozygotic twin pairs from two European twin registries. Over a third of vmeQTLs capture identical vmeQTL-environmental factor interactions in both datasets, and the majority of interactions are observed with blood cell counts. Correspondingly, over 60% of CpGs affected by genotype-monocyte and genotype-T cell interactions replicate as CpGs affected by genetic effects in the relevant cell type in an independent dataset. Most vmeQTLs also replicate in 1,348 UK non-twin adults and show longitudinal stability in a sample subset. Integrating gene expression and phenotype association results identifies multiple vmeQTLs that capture GxE effects relevant to human health. Examples include vmeQTLs interacting with blood cell type to influence DNA methylation in FAM65A, NAPRT, and CSGALNACT1 underlying immune disease susceptibility and progression.
Conclusions: Our findings identify novel genetic effects on human DNA methylation variability within a unique monozygotic twin study design. The results show the potential of vmeQTLs to identify gene-environment interactions and provide novel insights into complex traits.
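Because monozygotic twins share a genotype, the within-pair difference in methylation can serve as a per-pair measure of variability that is then tested against that genotype. The sketch below simulates this idea and is only an illustration under assumptions: the simulated effect sizes, the use of the absolute within-pair difference, and the simple least-squares slope are hypothetical choices, not the study's actual model or covariates.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
genotypes, abs_diffs = [], []
for _ in range(3000):                  # simulated twin pairs (not the 988 real ones)
    g = rng.choice([0, 1, 2])          # allele dosage, identical within a pair
    sd = 0.02 + 0.02 * g               # assumption: noise grows with dosage
    twin1 = 0.5 + rng.gauss(0, sd)     # methylation level at one CpG, twin 1
    twin2 = 0.5 + rng.gauss(0, sd)     # same CpG, twin 2
    genotypes.append(g)
    abs_diffs.append(abs(twin1 - twin2))

# A positive slope means pairs carrying more copies of the allele are more
# discordant, which is the signature expected of a variance-controlling locus.
beta = slope(genotypes, abs_diffs)
print(f"change in within-pair |difference| per allele: {beta:.4f}")
```

In real data the environmental factor driving the extra variability (for example a blood cell count) would be added as an interaction term to check whether it accounts for the signal.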
Title: Genetic impacts on within-pair DNA methylation variance in monozygotic twins capture gene-environment interactions and cell-type effects. Journal: Genome Biology.
Pub Date: 2026-02-07 DOI: 10.1186/s13059-026-03984-5
Somia Saidi, Mathieu Blaison, María Del Pilar Rodríguez-Ordóñez, Johann Confais, Hadi Quesneville
Background: The role of transposable elements (TEs) in host adaptation has gained interest in recent years. Individuals of the same species undergo independent TE insertions, providing genetic variability within populations, upon which natural selection can act to foster adaptation to environmental conditions.
Results: As de novo assembled genomes are becoming increasingly affordable, helping to overcome the bias introduced by relying on a single reference genome, there is a growing need for suitable pangenomic tools to explore the genomic diversity within a species. We developed a new pipeline called panREPET that identifies TE insertions shared by groups of individuals. Unlike other pangenomic tools, panREPET operates independently of a reference genome and provides the precise sequence and genomic coordinates of each TE copy for each genome.
Conclusions: We showcase the potential of this tool by identifying TE insertions shared among 42 Brachypodium distachyon genomes and by comparing our results with those of existing tools to demonstrate its advantages. Using panREPET, we were able to date two major TE bursts corresponding to major climate events: 22 kya during the Last Glacial Maximum and 10 kya during the Holocene, showing a potential link between environmental stress and TE activity.
Title: A reference-free pipeline for detecting shared transposable elements from pan-genomes to retrace their dynamics in a species. Journal: Genome Biology.
Pub Date: 2026-02-07 DOI: 10.1186/s13059-026-03966-7
Yongxin Ji, Jiaojiao Guan, Herui Liao, Jiayu Shang, Yanni Sun
Plasmids play a pivotal role in the emergence of multidrug-resistant and pathogenic bacteria, posing significant clinical challenges. However, the rapidly growing number of unannotated plasmids necessitates comprehensive characterization of their diverse properties. Here, we present PlasRAG, a tool that integrates multi-faceted property characterization of query plasmids and plasmid DNA retrieval based on textual queries. PlasRAG employs a bidirectional multi-modal information retrieval model that aligns DNA sequences with textual data, effectively overcoming the limitations of traditional approaches. Rigorous experiments demonstrate that PlasRAG delivers robust performance and enhanced analytical capabilities, underscoring the effectiveness of its architectural design.
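Bidirectional retrieval of the kind described above is commonly built on a shared embedding space in which a DNA sequence and a text description can be compared directly. The following sketch illustrates that general idea only: the three-dimensional vectors, the plasmid names, the property descriptions, and the use of cosine similarity are all hypothetical stand-ins, and nothing here reflects PlasRAG's actual encoders or interface.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vec, candidates):
    """Return candidate keys ranked by descending similarity to the query."""
    return sorted(candidates,
                  key=lambda key: cosine(query_vec, candidates[key]),
                  reverse=True)


# Hypothetical embeddings standing in for the outputs of a DNA encoder
# and a text encoder that were trained to share one vector space.
plasmid_vecs = {
    "plasmid_A": [0.9, 0.1, 0.0],
    "plasmid_B": [0.1, 0.8, 0.2],
}
text_vecs = {
    "carries beta-lactam resistance genes": [0.8, 0.2, 0.1],
    "conjugative, broad host range": [0.0, 0.9, 0.3],
}

# Text -> DNA: find the plasmid that best matches a textual query.
top_plasmid = retrieve(text_vecs["carries beta-lactam resistance genes"],
                       plasmid_vecs)[0]
# DNA -> text: find the description that best matches a query plasmid.
top_text = retrieve(plasmid_vecs["plasmid_B"], text_vecs)[0]

print(top_plasmid)
print(top_text)
```

The same ranking function serves both directions, which is the practical appeal of aligning the two modalities in one space.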
Title: PlasRAG: comprehensive plasmid characterization and retrieval through sequence-text alignment. Journal: Genome Biology.
Pub Date: 2026-01-30 DOI: 10.1186/s13059-026-03959-6
Cai-Jin Chen, Xiao-Xu Pang, Ya-Mei Ding, Wei-Ping Zhang, Yang Yang, Jie Liu, Anush Nersesyan, Bo-Wen Zhang, Susanne S Renner, Da-Yong Zhang, Wei-Ning Bai
Background: The inference of population structure in domestication studies is prone to biases whenever sampling is unbalanced and effective population sizes (N_e) differ across populations. Such biases can lead to the misclassification of large ancestral populations as admixed, particularly under single-origin domestication scenarios.
Results: We propose a novel parameterization strategy for the STRUCTURE software that combines the F model with the alternative ancestry prior (along with a smaller initial ALPHA value), and simulations demonstrate that the strategy mitigates biases arising from unbalanced sampling and unequal population sizes. We apply our strategy to the domestication history of the common walnut (Juglans regia), using whole-genome resequencing data from 298 individuals from across its range. The results support an origin of J. regia in South Asia, where walnut populations are characterized by high genetic diversity, extensive private allele content, low mutation load, and demographic stability. Building on this demographic framework, we further identify genomic regions under recent positive selection and candidate domestication genes involved in shell structure, pollen development, and lipid transport.
Conclusions: Our results clarify the long-standing debate on the geographic origin of walnut domestication and demonstrate that an optimized, model-aware use of STRUCTURE can substantially improve population-genetic inference in domestication studies and other systems characterized by complex demography.
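The parameterization named in the Results corresponds to a handful of switches in STRUCTURE's `extraparams` file. The sketch below renders such a fragment; it is an illustration under assumptions, not the authors' configuration. The helper name `structure_extraparams` is hypothetical, and the starting value of 1/K for ALPHA is an illustrative guess, since the abstract states only that a smaller initial ALPHA is used.

```python
def structure_extraparams(k, initial_alpha=None):
    """Render an extraparams fragment for a STRUCTURE run with K = k clusters.

    Hypothetical helper: the default initial ALPHA of 1/K is an assumption
    made for illustration, not a value reported in the abstract.
    """
    if k < 1:
        raise ValueError("k must be a positive number of clusters")
    alpha = initial_alpha if initial_alpha is not None else 1.0 / k
    settings = {
        "FREQSCORR": 1,             # F model: allele frequencies correlated across populations
        "INFERALPHA": 1,            # estimate ALPHA during the run
        "POPALPHAS": 1,             # separate ALPHA per population (alternative ancestry prior)
        "ALPHA": round(alpha, 4),   # smaller initial value of the ancestry prior parameter
    }
    return "\n".join(f"#define {name} {value}" for name, value in settings.items())


fragment = structure_extraparams(k=4)
print(fragment)
```

The number of clusters itself is set elsewhere (in `mainparams`), so it enters this fragment only through the assumed starting value of ALPHA.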
{"title":"Resolving sampling and population-size biases in domestication genomics supports a South Asian origin of walnuts.","authors":"Cai-Jin Chen, Xiao-Xu Pang, Ya-Mei Ding, Wei-Ping Zhang, Yang Yang, Jie Liu, Anush Nersesyan, Bo-Wen Zhang, Susanne S Renner, Da-Yong Zhang, Wei-Ning Bai","doi":"10.1186/s13059-026-03959-6","DOIUrl":"https://doi.org/10.1186/s13059-026-03959-6","url":null,"abstract":"<p><strong>Background: </strong>The inference of population structure in domestication studies is prone to biases whenever sampling is unbalanced and effective population sizes (N<sub>e</sub>) differ across populations. Such biases can lead to the misclassification of large ancestral populations as admixed, particularly under single-origin domestication scenarios.</p><p><strong>Results: </strong>We propose a novel parameterization strategy for the STRUCTURE software, combining the F model and alternative ancestry prior (along with a smaller initial ALPHA value), and simulations demonstrate that the strategy mitigates unbalanced sampling and unequal population size biases. We apply our strategy to the domestication history of the common walnut (Juglans regia), using whole-genome resequencing data from 298 individuals from across its range. The results support an origin of J. regia in South Asia, where walnut populations are characterized by high genetic diversity, extensive private allele content, low mutation load, and demographic stability. 
Building on this demographic framework, we further identify genomic regions under recent positive selection and candidate domestication genes involved in shell structure, pollen development, and lipid transport.</p><p><strong>Conclusions: </strong>Our results clarify the long-standing debate on the geographic origin of walnut domestication and demonstrate that an optimized, model-aware use of STRUCTURE can substantially improve population-genetic inference in domestication studies and other systems characterized by complex demography.</p>","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":" ","pages":""},"PeriodicalIF":10.1,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146092784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30 DOI: 10.1186/s13059-026-03935-0
Han Yuan, Johannes Linder, David R Kelley
Background: DNA sequence deep learning models can accurately predict epigenetic and transcriptional profiles, enabling analysis of gene regulation and genetic variant effects. While large-scale models like Enformer and Borzoi are trained on abundant data, they cannot cover all cell states and assays, necessitating training new models to analyze gene regulation in novel contexts. However, training models from scratch for new datasets is computationally expensive.
Results: In this study, we systematically develop and evaluate a transfer learning framework based on parameter-efficient fine-tuning for supervised regulatory sequence models. Using the state-of-the-art model Borzoi, our framework enables accurate model transfer while significantly reducing runtime and memory requirements. Across bulk and single-cell RNA-seq datasets, the transferred models effectively predict held-out gene expression changes, identify regulatory drivers in perturbation conditions, and predict cell-type-specific variant effects. We further demonstrate that transferring Borzoi to relevant cell types facilitates mechanistic interpretation of fine-mapped GWAS variants.
Conclusions: Our framework offers a scalable and practical solution for extending large sequence models to novel biological contexts, enabling mechanistic insight into gene regulation and variant effects.
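Illustrative sketch (not from the paper): the abstract does not say which parameter-efficient fine-tuning method is used, so the example below assumes a low-rank adapter (LoRA-style) on a single linear layer, purely to show why such methods reduce the number of trainable parameters. The class name LoRALinear, the rank, and the scaling factor alpha are hypothetical choices for this sketch and are not taken from Borzoi or the authors' framework.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (assumed LoRA-style).

    Effective weight: W + (alpha / rank) * B @ A, where only A and B are trained.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # pretrained weights, kept frozen
        self.rank = rank
        self.scale = alpha / rank
        # A starts with small random values, B starts at zero, so the adapter
        # initially leaves the pretrained layer's output unchanged.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_parameters(self):
        d_out, d_in = len(self.weight), len(self.weight[0])
        return self.rank * (d_in + d_out)

    def frozen_parameters(self):
        return len(self.weight) * len(self.weight[0])


if __name__ == "__main__":
    # Hypothetical 64 x 64 layer with a rank-4 adapter.
    weight = [[0.0] * 64 for _ in range(64)]
    layer = LoRALinear(weight, rank=4)
    print(layer.frozen_parameters(), layer.trainable_parameters())
```

In this toy setting the adapter trains 512 values while the 4,096 pretrained weights stay fixed; the saving grows with layer size, which is the general reason such methods lower runtime and memory during transfer.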
{"title":"Parameter-efficient fine-tuning enables scalable transfer of regulatory sequence models to novel contexts.","authors":"Han Yuan, Johannes Linder, David R Kelley","doi":"10.1186/s13059-026-03935-0","DOIUrl":"https://doi.org/10.1186/s13059-026-03935-0","url":null,"abstract":"<p><strong>Background: </strong>DNA sequence deep learning models can accurately predict epigenetic and transcriptional profiles, enabling analysis of gene regulation and genetic variant effects. While large-scale models like Enformer and Borzoi are trained on abundant data, they cannot cover all cell states and assays, necessitating training new model to analyze gene regulation in novel contexts. However, training models from scratch for new datasets is computationally expensive.</p><p><strong>Results: </strong>In this study, we systematically develop and evaluate a transfer learning framework based on parameter-efficient fine-tuning for supervised regulatory sequence models. Using the state-of-the-art model Borzoi, our framework enables accurate model transfer while significantly reducing runtime and memory requirements. Across bulk and single cell RNA-seq datasets, the transferred models effectively predict held-out gene expression changes, identify regulatory drivers in perturbation conditions, and predict cell-type-specific variant effects. 
We further demonstrate that transferring Borzoi to relevant cell types facilitates mechanistic interpretation of fine-mapped GWAS variants.</p><p><strong>Conclusions: </strong>Our framework offers a scalable and practical solution for extending large sequence models to novel biological contexts, enabling mechanistic insight into gene regulation and variant effects.</p>","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":" ","pages":""},"PeriodicalIF":10.1,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146092832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30 DOI: 10.1186/s13059-025-03908-9
Thomas Defard, Alice Blondel, Sebastien Bellow, Anthony Coleon, Guilherme Dias de Melo, Florian Mueller, Thomas Walter
Imaging-based spatial transcriptomics enables high-resolution spatial mapping of RNA species. A key challenge in imaging-based spatial transcriptomics is accurate cell segmentation to assign each RNA molecule to the right cell. Here, we present RNA2seg, a novel segmentation algorithm trained on over 4 million cells from MERFISH and CosMx datasets across seven organs using a teacher-student training scheme. RNA2seg integrates RNA point clouds and all available membrane and nuclear stainings. Validation on manually annotated data shows superior performance, including in zero-shot and few-shot settings.
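Illustrative sketch (not RNA2seg itself): the step that segmentation serves, assigning each RNA molecule to a cell, can be shown with a small example. It assumes a segmentation has already produced a 2D label mask in which 0 is background and positive integers are cell IDs, and that each transcript is a (gene, x, y) tuple in pixel coordinates. The function name assign_transcripts and the data layout are hypothetical and do not reflect the RNA2seg API.

```python
import math
from collections import Counter, defaultdict


def assign_transcripts(label_mask, transcripts):
    """Count transcripts per cell and gene from a segmentation label mask.

    label_mask[row][col] holds a cell ID (0 means background).
    transcripts is an iterable of (gene, x, y) with x along columns, y along rows.
    Returns ({cell_id: {gene: count}}, number_of_unassigned_transcripts).
    """
    height, width = len(label_mask), len(label_mask[0])
    counts = defaultdict(Counter)
    unassigned = 0
    for gene, x, y in transcripts:
        # floor (not int) so that negative coordinates fall outside the image
        col, row = math.floor(x), math.floor(y)
        inside = 0 <= row < height and 0 <= col < width
        cell_id = label_mask[row][col] if inside else 0
        if cell_id == 0:
            unassigned += 1  # background or outside the image
        else:
            counts[cell_id][gene] += 1
    return {cell: dict(genes) for cell, genes in counts.items()}, unassigned


if __name__ == "__main__":
    # Toy 2 x 3 mask with two cells; coordinates are made up for illustration.
    mask = [
        [1, 1, 0],
        [0, 2, 2],
    ]
    spots = [
        ("A", 0.2, 0.1), ("A", 1.5, 0.9),
        ("B", 1.1, 1.0), ("B", 2.9, 1.9),
        ("A", 2.5, 0.5), ("C", -0.5, 0.0),
    ]
    print(assign_transcripts(mask, spots))
```

Any error in the mask propagates directly into the per-cell counts, which is why segmentation accuracy matters for downstream expression analysis.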
{"title":"RNA2seg: a generalist model for cell segmentation in image-based spatial transcriptomics.","authors":"Thomas Defard, Alice Blondel, Sebastien Bellow, Anthony Coleon, Guilherme Dias de Melo, Florian Mueller, Thomas Walter","doi":"10.1186/s13059-025-03908-9","DOIUrl":"https://doi.org/10.1186/s13059-025-03908-9","url":null,"abstract":"<p><p>Imaging-based spatial transcriptomics enables high-resolution spatial mapping of RNA species. A key challenge in imaging-based spatial transcriptomics is accurate cell segmentation to assign each RNA molecule to the right cell. Here, we present RNA2seg, a novel segmentation algorithm trained on over 4 million cells from MERFISH and CosMx datasets across seven organs using a teacher-student training scheme. RNA2seg integrates RNA point clouds and all available membrane and nuclear stainings. Validation on manually annotated data shows superior performance including in zero-shot and few-shot settings.</p>","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":" ","pages":""},"PeriodicalIF":10.1,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146092825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}