Pub Date: 2026-01-13; DOI: 10.1186/s13059-025-03916-9
Breeshey Roskams-Hieter, Øyvind Almelid, Chris P Ponting
Human traits vary in part because of genetically determined changes in transcription factor binding affinity within gene regulatory regions. However, few trait-causal variants or mechanisms are known. Here we propose 1,935 variants as strong candidates for causally altering human traits. We discover these through baal-nf, which uses chromatin immunoprecipitation-sequencing data to identify allelic imbalance at heterozygous sites for affinity-concordant positions within transcription factor- and co-factor binding motifs. These allele-specific binding sites are evolutionarily conserved and enriched for trait and gene expression associations. baal-nf and these high-quality allele-specific binding sites allow trait variation due to altered transcription factor binding to be investigated.
Title: baal-nf identifies motif-disrupting variants that decrease transcription factor binding affinity. Journal: Genome Biology.
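The idea of allelic imbalance at a heterozygous site can be illustrated with a minimal sketch. It assumes a plain exact binomial test against an expected 50:50 allele ratio; this is a common textbook stand-in and is not the statistical model used by baal-nf, which the abstract does not describe. The function name and the read counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one heterozygous site.

    Assumption: without allele-specific binding, reads carry the reference
    allele with probability p_ref (0.5 by default).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


balanced = binomial_two_sided_p(20, 20)  # hypothetical counts: no imbalance
skewed = binomial_two_sided_p(35, 5)     # hypothetical counts: strong imbalance
print(f"balanced p = {balanced:.3g}, skewed p = {skewed:.3g}")
```

In practice ChIP-seq allele counts are also affected by reference mapping bias and copy number, so a raw binomial test like this one tends to be too optimistic.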
Pub Date: 2026-01-12; DOI: 10.1186/s13059-025-03927-6
Viorel Munteanu, Michael A Saldana, David Dreifuss, Wenhao O Ouyang, Jannatul Ferdous, Fatemeh Mohebbi, Jessica Schlueter Roseberry, Dumitru Ciorba, Viorel Bostan, Victor Gordeev, Nicolae Drabcinski, Justin Maine Su, Nadiia Kasianchuk, Nitesh Kumar Sharma, Sergey Knyazev, Eva Aßmann, Andrei Lobiuc, Mihai Covasa, Keith A Crandall, Nicholas C Wu, Christopher E Mason, Braden T Tierney, Alexander G Lucaci, Roel A Ophoff, Cynthia Gibas, Piotr Rzymski, Pavel Skums, Helena Solo-Gabriele, Niko Beerenwinkel, Alex Zelikovsky, Martin Hölzer, Adam Smith, Serghei Mangul
Wastewater-based genomic surveillance (WWGS) has proven effective for monitoring SARS-CoV-2 and other viruses within communities. It enables rapid detection of known and emerging mutations and provides insights into circulating lineages. Despite its advantages, WWGS faces challenges in sample processing and computational analysis, particularly in distinguishing similar lineages and identifying novel ones. Recent methods for wastewater sequencing (WWS) analysis remain largely untested amid declining clinical surveillance and ongoing viral evolution. This review examines opportunities and limitations of WWGS, focusing on sample preparation, sequencing technologies, and bioinformatics approaches, and highlights its potential to strengthen public health monitoring systems.
Title: SARS-CoV-2 wastewater genomic surveillance: approaches, challenges, and opportunities. Journal: Genome Biology 27(1):1. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12794521/pdf/
Pub Date: 2026-01-12; DOI: 10.1186/s13059-025-03928-5
Xing Tian Yang, Chun Hing She, CaiCai Zhang, Daniel Leung, Jing Yang, Koon-Wing Chan, Jaime S Rosa Duque, Yu Lung Lau, Wanling Yang
Variant calling in segmental duplications is challenging for short-read sequencing because of ambiguous read origins. We present SDrecall, a method for sensitive variant detection in these regions. Upon constructing a network of homologous sequences, SDrecall realigns reads to each segmental duplication from its homologous counterparts. Realignments are phased and assembled into haplotypes via graph-based algorithms, followed by integer linear programming to retain the two most plausible haplotypes. Tested against long-read benchmarks, SDrecall achieved 95% sensitivity, while maintaining manageable false positives for short variants. SDrecall thus offers significant value for molecular diagnosis in terms of causal mutation detection within homologous regions.
Title: SDrecall: a sensitive approach for variant detection in segmental duplications. Journal: Genome Biology.
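The final step described above, keeping the two most plausible haplotypes, can be sketched as a tiny selection problem. The abstract does not give SDrecall's objective or constraints, so the sketch below makes loud assumptions: plausibility is scored as the number of distinct reads a pair of haplotypes explains, and exhaustive search over pairs stands in for integer linear programming. All names and data are hypothetical.

```python
from itertools import combinations


def pick_two_haplotypes(support: dict[str, set[str]]) -> tuple[str, str]:
    """Return the pair of candidate haplotypes that together explain the most reads.

    `support` maps a haplotype label to the set of read IDs consistent with it.
    Assumption: read coverage is the plausibility score (not SDrecall's objective).
    """
    if len(support) < 2:
        raise ValueError("need at least two candidate haplotypes")
    best_pair, best_covered = None, -1
    for a, b in combinations(sorted(support), 2):
        covered = len(support[a] | support[b])  # reads explained by either haplotype
        if covered > best_covered:
            best_pair, best_covered = (a, b), covered
    return best_pair


# Hypothetical candidates assembled from realigned reads.
candidates = {
    "hapA": {"r1", "r2", "r3"},
    "hapB": {"r3", "r4"},
    "hapC": {"r1", "r2"},
    "hapD": {"r5", "r6", "r7"},
}
chosen = pick_two_haplotypes(candidates)
print(chosen)
```

Exhaustive search is fine for a handful of candidates; an integer linear program becomes useful when extra constraints have to be enforced alongside the selection.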
Pub Date: 2026-01-12; DOI: 10.1186/s13059-026-03937-y
Sergi Cervilla, Daniela Grases, Elena Perez, Francisco X Real, Eva Musulen, Julieta Aprea, Manel Esteller, Eduard Porta-Pardo
Background: Spatial transcriptomics (ST) technologies are reshaping our understanding of tissue organization and cellular context in health and disease. However, technical benchmarking across platforms remains limited, particularly in formalin-fixed, paraffin-embedded (FFPE) clinical samples, which represent the most common tissue format in oncology.
Results: Here, we systematically benchmark five commercial ST platforms (Visium v1, Visium v2/CytAssist, Visium HD, Xenium, and CosMx) using matched FFPE human tumor sections from six cancer types. Uniquely, our study includes both sequencing-based and imaging-based platforms profiled on the same samples, enabling direct technical comparisons across spatial capture modalities. We evaluate platform performance across multiple dimensions, including transcript and UMI detection, gene-histology concordance, cell type recovery, and integration with a targeted protein panel (Visium v2, 30 proteins), enabling spatial multi-omics. We also quantify the impact of sampling strategies and area coverage on cell type estimation, revealing trade-offs in spatial resolution versus tissue context. Notably, we present the first same-sample comparison of Xenium Multi-Tissue (377 genes) and Xenium Prime (5,000 genes), highlighting key differences in transcript recovery and spatial signal despite shared chemistry and imaging infrastructure. Finally, we integrate Visium targeted protein data with matched RNA profiles, uncovering widespread RNA-protein decoupling and spatial heterogeneity in concordance.
Conclusions: Collectively, this work provides a harmonized dataset and technical reference for the spatial transcriptomics community, offering insight into the relative strengths, limitations, and design considerations associated with high-throughput spatial profiling of FFPE tumors.
Title: A technical comparison of spatial transcriptomics platforms across six cancer types. Journal: Genome Biology.
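RNA-protein concordance of the kind mentioned above is often summarized per gene with a rank correlation across spots. The sketch below assumes Spearman correlation as the concordance measure; the abstract does not state which statistic the authors used, and the expression values are invented for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-spot values for one gene and its matched protein.
rna = [5, 40, 12, 80, 3]
concordant = spearman(rna, [1.1, 2.0, 1.5, 9.7, 0.9])
decoupled = spearman(rna, [9.7, 1.1, 2.0, 0.9, 1.5])
print(f"concordant = {concordant:.2f}, decoupled = {decoupled:.2f}")
```

A rank-based measure is a reasonable default here because RNA counts and protein signal sit on different scales.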
Pub Date: 2026-01-10; DOI: 10.1186/s13059-025-03867-1
Ana Gabriela Vasconcelos, Daniel McGuire, Noah Simon, Patrick Danaher, Ali Shojaie
Differential expression is a key application of imaging spatial transcriptomics, moving analysis beyond cell type localization to examining cell state responses to microenvironments. However, spatial data poses new challenges to differential expression: segmentation errors cause bias in fold-change estimates, and correlation among neighboring cells leads standard models to inflate statistical significance. We find that ignoring these issues can result in considerable false discoveries that greatly outnumber true findings. We present a suite of solutions to these fundamental challenges, and implement them in the R package smiDE.
Title: Differential expression analysis for spatially correlated data using smiDE. Journal: Genome Biology.
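Why correlation among neighboring cells inflates significance can be seen with a back-of-the-envelope calculation. The sketch assumes the classic design-effect approximation for equally sized clusters with a common within-cluster correlation; it is not the model implemented in smiDE, and the cell counts and correlation are invented.

```python
from math import sqrt


def design_effect(cluster_size: int, rho: float) -> float:
    """Variance inflation for clustered observations: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * rho


def effective_sample_size(n_cells: int, cluster_size: int, rho: float) -> float:
    """Number of independent cells that carry the same information."""
    return n_cells / design_effect(cluster_size, rho)


def z_inflation(cluster_size: int, rho: float) -> float:
    """Factor by which a naive z-statistic is overstated when correlation is ignored."""
    return sqrt(design_effect(cluster_size, rho))


# Hypothetical tissue: 10,000 cells in neighborhoods of 50, correlation 0.2.
n_eff = effective_sample_size(10_000, 50, 0.2)
inflation = z_inflation(50, 0.2)
print(f"effective cells = {n_eff:.0f}, z inflated by {inflation:.2f}x")
```

Under these assumptions the 10,000 cells behave like fewer than 1,000 independent ones, which is the kind of gap that turns noise into apparent discoveries.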
Pub Date: 2026-01-07; DOI: 10.1186/s13059-025-03914-x
Abhishek Gogna, Bahareh Kamali, Valentin Wimmer, Renate H Schmidt, Ehsan Eyshi Rezaei, Wera Maria Eckhoff, Jochen C Reif, Yusheng Zhao
Background: Breeding programs prioritize the average performance of a genotype across environments and may overlook promising candidates for specific environments. To address this challenge, we propose a genomic prediction framework to select high-yielding genotypes tailored to individual environments.
Results: We compiled winter wheat grain yield data from 13,285 genotypes (6,766 lines and 6,519 hybrids) evaluated in yield plots at 31 Central European sites from 2010 to 2022. With integrated genomic data, we show that convolutional neural networks become competitive with, and eventually superior to, traditional genomic best linear unbiased prediction (GBLUP) in predicting the average genotype performance of lines only as the size of the training dataset increases. We then extend our prediction models to account for genotype-by-environment (G × E) interactions by incorporating information about the growth environment. We observe a 23% improvement in predicting the environment-specific performance of new hybrids within a network of test environments with GBLUP-based models. To better understand the environmental variables driving G × E interactions, we conduct analyses on a core set of 500 genetically diverse lines. Using machine learning, we identify pivotal environmental variables driving the clustering of study environments in Central Europe and highlight the benefit of modelling G × E interactions when selecting enviromically adapted varieties.
Conclusions: Our results suggest that big data in combination with machine learning and deep learning methods offers new ways to widen the genetic bottleneck often encountered when advancing candidates from early limited-environment to late-stage multi-environment evaluations. This promises faster delivery of breeding progress to farmers' fields.
Title: Predicting enviromically adapted varieties with big data. Journal: Genome Biology.
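GBLUP models rest on a genomic relationship matrix that describes how similar genotypes are at the marker level. The sketch below assumes the widely used VanRaden (method 1) construction; the abstract does not say how the authors built their matrix, and the marker data are a toy example.

```python
def genomic_relationship_matrix(genotypes):
    """VanRaden method 1: G = Z Z^T / (2 * sum_j p_j * (1 - p_j)).

    `genotypes` is a list of rows (one per genotype) with markers coded 0/1/2.
    Allele frequencies p_j are estimated from the same data (an assumption).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Center each marker by twice its allele frequency.
    centered = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(ri, rk)) / scale for rk in centered]
        for ri in centered
    ]


# Toy data: 4 genotypes x 3 markers.
toy = [[0, 1, 2], [1, 1, 0], [2, 1, 1], [1, 1, 1]]
G = genomic_relationship_matrix(toy)
for row in G:
    print([round(v, 3) for v in row])
```

In a GBLUP analysis this matrix would serve as the covariance structure of the genetic effects in a mixed model; fitting that model is beyond this sketch.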
Pub Date: 2026-01-06; DOI: 10.1186/s13059-025-03926-7
Shiron Drusinsky, Sean Whalen, Katherine S Pollard
Background: Models that predict gene expression levels from DNA sequence struggle to predict differences between individuals when given their personal genome sequences. These models are generally trained on reference genome sequences, and thus have never observed examples of genetic variation at any locus during training, which may explain their lack of generalizability to personal genome sequences that do contain variation.
Results: We utilize fine-tuning with personal genomes and matched tissue-specific gene expression values to develop Variformer, a deep sequence-based neural network. Across held-out people, Variformer predicts expression with accuracy that approaches the cis-heritability of most genes and prioritizes genetic variants across the allele frequency spectrum that are enriched for motif disruption and other functional annotations. We highlight how Variformer fails to generalize to unseen genes.
Conclusions: Our work suggests that fine-tuning with personal genomes corrects previously reported shortcomings of gene expression prediction across unseen individuals, but does not learn a regulatory grammar that generalizes to unseen loci. Fine-tuned deep expression models thus show performance and limitations similar to those of state-of-the-art linear models, highlighting a gap for the field.
Title: Deep-learning prediction of gene expression from personal genomes. Journal: Genome Biology.
Pub Date: 2026-01-02; DOI: 10.1186/s13059-025-03909-8
Isobel Ronai, Rodrigo de Paula Baptista, Nicole S Paulat, Julia C Frederick, Tal Azagi, Julian W Bakker, Katie C Dillon, Hein Sprong, David A Ray, Travis C Glenn
Background: Ticks are obligate blood-feeding parasites associated with a huge diversity of diseases globally. The hard tick Ixodes ricinus is the key vector of Lyme borreliosis and tick-borne encephalitis in Western Eurasia. Ixodes ticks have large and repetitive genomes that are not yet well characterized.
Results: Here we generate two high-quality I. ricinus genome assemblies, with haploid genome assembly sizes of approximately 2.15 Gbp. We find transposable elements comprise at least 69% of the two I. ricinus genome assemblies, amongst the highest proportions found in animals. The transposable elements in ticks are highly diverse and novel, so we constructed a repeat library for ticks using our I. ricinus genome assemblies and the high-quality genome assembly of I. scapularis, another major tick vector of Lyme borreliosis. To understand the impact of transposable elements on tick genomes we compared their accumulation in the two Ixodes sister species. We find transposable elements in these two species to have distinctive post-speciation patterns, suggesting transposable elements are drivers of genome evolution in ticks.
Conclusions: The I. ricinus genome assemblies and our tick repeat library will be valuable resources for biological insights into these important ectoparasites. Our findings highlight that further research into the impact of transposable elements on the genomes of blood-feeding parasites is required.
Title: The repetitive genome of the Ixodes ricinus tick reveals transposable elements have driven genome evolution in ticks. Journal: Genome Biology.
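As a quick worked calculation, the two figures reported above (an assembly of about 2.15 Gbp and a transposable element share of at least 69%) imply a lower bound on the amount of repeat-derived sequence. The variable and function names are illustrative only.

```python
ASSEMBLY_SIZE_GBP = 2.15   # approximate haploid assembly size from the abstract
MIN_TE_FRACTION = 0.69     # "at least 69%" transposable elements


def te_content_gbp(assembly_gbp: float, te_fraction: float) -> float:
    """Sequence (in Gbp) attributed to transposable elements."""
    return assembly_gbp * te_fraction


te_gbp = te_content_gbp(ASSEMBLY_SIZE_GBP, MIN_TE_FRACTION)
non_te_gbp = ASSEMBLY_SIZE_GBP - te_gbp
print(f"TE-derived: >= {te_gbp:.2f} Gbp; remaining: <= {non_te_gbp:.2f} Gbp")
```

So roughly 1.48 Gbp or more of each assembly is transposable element sequence, leaving at most about 0.67 Gbp for everything else.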
Pub Date: 2026-01-02; DOI: 10.1186/s13059-025-03907-w
Miaomiao Wen, Xiaodong Liang, Keke Shi, Liangdan Fei, Yijie Wang, Yu Zhou, Kun Wang
Background: RNA-RNA spatial interactions play crucial roles in various cellular processes, including transcriptional and post-transcriptional gene regulation. However, research on this area of RNA regulation remains limited in plants.
Results: Here, we adapt the global RNA-RNA interaction mapping method for plants and develop plant RNA in situ conformation sequencing, pRIC-seq, to generate comprehensive RNA-RNA spatial interaction maps for diploid and tetraploid cotton. We also perform global nuclear run-on followed by cap-selection assay, GRO-cap, and integrate multi-omics data to construct enhancer landscapes in these cotton species. Focusing on enhancer-promoter (E-P) RNA interactions, we find that tetraploid cotton has, following polyploidization, gained numerous novel E-P RNA interactions, thereby increasing its genomic regulatory complexity. Comparative analyses between wild-type and fuzzless/lintless mutant tetraploid cotton reveal that RNA-RNA interactions, including E-P RNA interactions, play pivotal roles in fiber development. Our study also identifies short tandem repeats and transposable elements as potential mediators of E-P RNA interactions through base pairing within the cotton genome. Finally, by integrating genome-wide association study (GWAS) and eQTL data from previous studies, we observe that the RNA-RNA interactions we detect are significantly enriched near those functional mutation sites. Importantly, using RAP-qPCR, we confirm that GWAS-related enhancers interact with the promoters of protein-coding genes, explaining their regulatory mechanisms in fiber trait control.
Conclusions: Our results provide the first genome-wide RNA-RNA interaction map in higher plants and offer valuable insights into the enhancer-regulated pathway and targets for future breeding studies.
Title: Global atlas of enhancer-promoter interactome in cotton genome revealed by profiling RNA-RNA spatial interactions. Journal: Genome Biology.
Pub Date : 2026-01-02 DOI: 10.1186/s13059-025-03917-8
Yang Dong, Tao Cheng, Xiang Liu, Xin-Xin Fu, Yang Hu, Xian-Fa Yang, Ling-En Yang, Hao-Ran Li, Zhi-Wen Bian, Naihe Jing, Jie Liao, Xiaohui Fan, Peng-Fei Xu
Background: Elucidating the spatiotemporal dynamics of gene expression is essential for understanding complex physiological and pathological processes. Current spatial transcriptomics techniques are hindered by low read depths and limited gene detection.
Results: Here, we introduce Palette, a pipeline that infers detailed spatial gene expression patterns from bulk RNA-seq data, utilizing existing spatial transcriptomics data as the sole reference. This method identifies more precise expression patterns by smoothing, imputing and adjusting gene expression. We apply Palette to reconstruct the zebrafish SpatioTemporal Expression Profiles (zSTEP) by integrating 53-slice serial bulk RNA-seq data from three developmental stages with existing spatial transcriptomics and image references. zSTEP provides a comprehensive cartographic resource for examining gene expression and investigating developmental events within zebrafish embryos. Utilizing machine learning-based screening, we identify key morphogens and transcription factors essential for anteroposterior axis development and characterize their dynamic distribution throughout embryogenesis. In addition, among these transcription factors, Hox family genes are found to be pivotal in anteroposterior axis refinement. Their expression is closely correlated with cellular anteroposterior identities, and hoxb genes may act as central regulators in this process.
Conclusions: This study presents Palette, a pipeline for integrating bulk RNA-seq data and spatial transcriptomics data, and zSTEP, a comprehensive cartographic resource for investigating zebrafish early embryonic development. In addition, key morphogens and transcription factors essential for anteroposterior axis establishment and refinement are identified.
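The abstract does not describe Palette's algorithm, so the following is only an illustrative sketch of the general idea of using a spatial reference to spread a bulk measurement over spots, followed by a simple neighbour smoothing step. It is not the authors' implementation. The function names (`distribute_bulk`, `smooth`), the proportional weighting rule, the `pseudocount` parameter and the toy numbers are all assumptions made for this example.

```python
def distribute_bulk(bulk_total, reference, pseudocount=1e-9):
    """Split one gene's bulk expression for a slice across its spots.

    Assumption (not from the paper): each spot receives a share of the
    bulk value proportional to its expression in the spatial reference.
    The pseudocount avoids division by zero when the reference is all zeros.
    """
    weights = [r + pseudocount for r in reference]
    total = sum(weights)
    return [bulk_total * w / total for w in weights]


def smooth(values, neighbors):
    """Replace each spot's value by the mean of itself and its neighbours.

    `neighbors[i]` lists the indices of the spots adjacent to spot i.
    This is a plain local average, used here as a stand-in for smoothing.
    """
    smoothed = []
    for i, value in enumerate(values):
        group = [value] + [values[j] for j in neighbors[i]]
        smoothed.append(sum(group) / len(group))
    return smoothed


# Toy slice with three spots arranged in a line (hypothetical values).
reference = [0.0, 2.0, 6.0]        # reference expression per spot
adjacency = [[1], [0, 2], [1]]     # spot 1 touches spots 0 and 2

estimate = distribute_bulk(80.0, reference)
smoothed_estimate = smooth(estimate, adjacency)
```

In this sketch the distribution step keeps the slice total equal to the bulk measurement, while the smoothing step trades that exactness for spatial continuity; a real pipeline would need to decide how to reconcile the two.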
{"title":"Unravelling the progression of the zebrafish primary body axis with reconstructed spatiotemporal transcriptomics.","authors":"Yang Dong, Tao Cheng, Xiang Liu, Xin-Xin Fu, Yang Hu, Xian-Fa Yang, Ling-En Yang, Hao-Ran Li, Zhi-Wen Bian, Naihe Jing, Jie Liao, Xiaohui Fan, Peng-Fei Xu","doi":"10.1186/s13059-025-03917-8","DOIUrl":"https://doi.org/10.1186/s13059-025-03917-8","PeriodicalId":12611,"journal":{"name":"Genome Biology"},"PeriodicalIF":10.1,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"145896468"}