Pub Date: 2026-02-01; Epub Date: 2025-12-04; DOI: 10.1038/s44320-025-00128-y
Antoni Matyjaszkiewicz, James Sharpe
Successful computational modelling of complex biological phenomena will depend on the seamless sharing of models and hypotheses between researchers of all backgrounds, experimental and theoretical. LimbNET, a new online tool for modelling, simulating and visualising spatiotemporal patterning in limb development, aims to facilitate this process within the limb development community. LimbNET enables remote users to define and simulate arbitrary gene regulatory network (GRN) models of 2D spatiotemporal developmental patterning processes. Researchers can test and compare each other's hypotheses within a common framework. A database of previously created models empowers users to simulate, explore, and extend each other's work. Spatiotemporally varying gene expression intensities, derived from image-based data, are mapped into a standardised computational description of limb growth, integrated within our modelling framework. This enables direct comparison not only between datasets but also between data and simulation outputs, closing the feedback loop between experiments and simulation via parameter optimisation. All functionality is accessible through a web browser (https://limbnet.embl.es), requiring no special software, and opening the field of image-driven modelling to the full scientific community.
Title: LimbNET: collaborative platform for simulating spatial patterns of gene networks in limb development. Journal: Molecular Systems Biology, pp. 228-240. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12864987/pdf/
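To make the idea of simulating a GRN model over space and time concrete, the following is a minimal sketch only. It is not LimbNET's API or model format, which the abstract does not describe. It assumes a single hypothetical gene A on a 1D row of cells (the abstract refers to 2D), activated through a Hill function by a fixed, invented input gradient, with linear decay and diffusion between neighbouring cells. The function name and every parameter value are illustrative assumptions.

```python
def simulate_grn_1d(n_cells=20, steps=2000, dt=0.01,
                    diffusion=0.1, decay=1.0, k=0.5, hill=2):
    """Explicit-Euler simulation of one gene on a 1D row of cells.

    Hypothetical toy model (not taken from LimbNET):
        dA/dt = S^h / (k^h + S^h) - decay * A + diffusion * laplacian(A)
    where S is a fixed input gradient that is highest at cell 0.
    """
    # Invented input gradient: 1.0 at the first cell, 0.0 at the last.
    signal = [1.0 - i / (n_cells - 1) for i in range(n_cells)]
    a = [0.0] * n_cells  # expression starts at zero everywhere

    for _ in range(steps):
        updated = []
        for i in range(n_cells):
            # No-flux boundaries: a missing neighbour mirrors the cell itself.
            left = a[i - 1] if i > 0 else a[i]
            right = a[i + 1] if i < n_cells - 1 else a[i]
            laplacian = left - 2.0 * a[i] + right
            production = signal[i] ** hill / (k ** hill + signal[i] ** hill)
            updated.append(
                a[i] + dt * (production - decay * a[i] + diffusion * laplacian)
            )
        a = updated
    return a


profile = simulate_grn_1d()
```

With these assumed parameters, `profile` holds one expression level per cell, highest near the source of the input gradient. A real model would replace the single Hill term with the regulatory interactions under test and run on a growing 2D domain.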
Pub Date: 2026-02-01; Epub Date: 2025-12-10; DOI: 10.1038/s44320-025-00172-8
Elisa Balmas, Maria L Ratto, Kirsten E Snijders, Silvia Becca, Carla Liaci, Irene Ricca, Giorgio R Merlo, Raffaele A Calogero, Luca Alessandrì, Sasha Mendjan, Alessandro Bertero
Functional genomics screens in human induced pluripotent stem cells (hiPSCs) remain challenging despite their transformative potential. We developed iPS2-seq: an inducible, clone-aware screening platform that enables phenotype-agnostic, single-cell resolved dissection of loss-of-function effects in hiPSC derivatives, including complex multicellular models such as organoids. iPS2-seq distinguishes true perturbation effects from genetic and epigenetic variability. It supports pooled and arrayed formats, integrates with microfluidic or split-pool single-cell RNA sequencing, and extends to multi-omic profiling of chromatin and proteins. A dedicated pipeline, catcheR, streamlines design and analysis. The platform enables stage-specific follow-up dissection of screen hits. We demonstrate this by targeting congenital heart disease-associated genes in monolayer cardiomyocytes and organoids. This reveals that epigenetic neuroectodermal priming interferes with germ layer differentiation in specific clones. Accounting for this bias, we show that SMAD2 controls cardiac progenitor specification, with knockdown redirecting cells toward fibroblast and epicardial fates. iPS2-seq unlocks rigorous functional genomics in hiPSC-based models.
Title: Single cell transcriptional perturbome in pluripotent stem cell models. Journal: Molecular Systems Biology, pp. 179-227. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12864791/pdf/
Pub Date: 2026-01-29; DOI: 10.1038/s44320-026-00190-0
Duc Tung Vu, William Sibran, Andreas Metousis, Laurine Vandewynckel, Basak Eraslan, Liesel Goveas, Ericka Cm Itang, Claire Deldycke, Adriana Figueroa-Garcia, Réginald Lefèbvre, Johannes Bruno Müller-Reif, Sebastian Virreira Winter, Marie-Christine Chartier-Harlin, Jean-Marc Taymans, Matthias Mann, Ozge Karayel
Pathogenic mutations in leucine-rich repeat kinase 2 (LRRK2) are the predominant genetic cause of Parkinson's disease (PD) and often increase kinase activity, making LRRK2 inhibitors promising treatment options. Although LRRK2 kinase inhibitors are advancing clinically, non-invasive readouts of LRRK2-linked pathway modulation remain limited. Profiling urinary proteomes from 1215 individuals across three cohorts and integrating whole-genome sequencing from >500 participants to map genotype-proteome associations, we identified 177 urinary proteins associated with pathogenic LRRK2, enriched for lysosomal/glycosphingolipid, immune, and membrane-trafficking pathways. Machine learning narrowed the features to a cohort-agnostic 30-protein panel that classified G2019S carriers with a mean ROC AUC of 0.91 across independent tests. To evaluate translation, we performed multi-organ and urinary proteomics in rat gain- and loss-of-function models (BAC-LRRK2^G2019S and Lrrk2^KO) and after Lrrk2 inhibition (MLi-2 and PF-475), revealing tissue-specific responses (strongest in kidney) and cross-species overlap, including 24 brain proteins detectable in human urine. Rat-derived perturbations predicted LRRK2 mutation status in patients (AUC 0.75) and reversed with Lrrk2 inhibition, supporting their pharmacodynamic utility. Together, our findings establish urine as a scalable, non-invasive matrix that captures systemic and brain-relevant consequences of LRRK2 dysfunction and nominate candidate pharmacodynamic markers set to support LRRK2-directed trials.
Title: Multi-cohort, cross-species urinary proteomics reveals signatures of LRRK2 dysfunction in Parkinson's disease. Journal: Molecular Systems Biology.
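The classification results in this abstract are reported as ROC AUC values (0.91 and 0.75). As a reminder of what that statistic measures, the sketch below computes it as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, counting ties as half. The labels and scores are toy values invented for illustration, not data from the study, and `roc_auc` is a hypothetical helper rather than the authors' pipeline.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    Equivalent to the normalised Mann-Whitney U statistic:
    the fraction of (positive, negative) pairs ranked correctly,
    with ties counted as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example: 1 = carrier, 0 = non-carrier (invented values).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.91 and 0.75 should be read.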
Pub Date: 2026-01-22; DOI: 10.1038/s44320-026-00187-9
Erik Marcel Heller, Karen Barthel, Markus Räschle, Klaske M Schukken, Jason M Sheltzer, Zuzana Storchová
Aneuploidy, a hallmark of cancer, alters chromosome copy numbers and with that the abundance of hundreds of proteins. Evidence suggests that levels of proteins encoded on affected chromosomes are often buffered toward their abundances observed in diploids. Despite its prevalence, the molecular mechanisms driving this protein dosage compensation remain largely unknown. It is unclear whether all proteins are buffered similarly, what factors determine buffering, and whether dosage compensation varies across different cell lines or tumor types. Moreover, its potential adaptive advantage and therapeutic relevance remain unexplored. We established a novel approach to quantify protein dosage buffering in a gene copy number-dependent manner, showing that dosage compensation is widespread but variable in cancer samples. By developing multifactorial machine learning models, we identify gene dependency, protein complex participation, haploinsufficiency, and mRNA decay as key predictors of buffering. We show that dosage compensation affects oncogenic potential and that higher buffering correlates with reduced proteotoxic stress and increased drug resistance. These findings highlight protein dosage compensation as a crucial regulatory mechanism with therapeutic potential in aneuploid cancers.
Title: Protein buffering of aneuploidy is driven by coordinated factors identified through machine learning. Journal: Molecular Systems Biology.
Human viruses rely on host translation resources, including the cellular tRNA pool, because they lack tRNA genes. Using tRNA sequencing, we profiled mature tRNAs during infections with human cytomegalovirus (HCMV) and SARS-CoV-2. HCMV-induced alterations in mature tRNA levels were predominantly virus-driven, with minimal influence from the cellular immune response. Certain post-transcriptional modifications, correlated with tRNA stability, were actively manipulated by HCMV. By contrast, SARS-CoV-2 caused minimal changes in mature tRNA levels or modifications. Comparing viral codon usage with proliferation- versus differentiation-associated codon-usage signatures in human genes revealed striking divergence. HCMV genes aligned with differentiation codon usage, whereas SARS-CoV-2 genes matched proliferation codon usage. Structural and gene-expression genes in both viruses showed strong adaptation to host tRNA pools. Finally, a systematic CRISPR screen of human tRNA genes and tRNA-modifying enzymes identified specific tRNAs and enzymes that either enhanced or restricted HCMV infectivity and influenced cellular growth. Together, these data define a dynamic interplay between the host tRNA landscape and viral infection, illuminating the mechanisms governing host-virus interactions.
Pub Date: 2026-01-20; DOI: 10.1038/s44320-025-00181-7
Noa Aharon-Hefetz, Michal Schwartz, Einav Aharon, Noam Stern-Ginossar, Orna Dahan, Yitzhak Pilpel
Title: Essentiality and dynamic expression of the human tRNA pool during viral infection. Journal: Molecular Systems Biology.
Pub Date: 2026-01-20; DOI: 10.1038/s44320-026-00189-7
Horia Todor, Lili M Kim, Jürgen Jänes, Hannah N Burkhart, Seth A Darst, Pedro Beltrao, Carol A Gross
Accurate prediction of protein complex structures by AlphaFold3 and similar programs has been used to predict the presence of protein-protein interactions (PPIs), but this technique has never been applied to an entire genome due to onerous computational requirements and questionable utility. Here we present pooled-PPI prediction, a technique that dramatically improves the accuracy of genome-scale screens compared to a paired approach while simultaneously reducing inference time (~twofold) and the number of jobs (~100-fold). We use this technique to predict the structure of all 113,050 pairwise PPIs in Mycoplasma genitalium using only 2027 AlphaFold3 jobs. This unbiased and comprehensive dataset was highly predictive of known interactions, revealed a previously unappreciated but widespread size bias in AlphaFold interface scores, correctly identified protein-protein interfaces in macromolecular complexes, and uncovered new biology in M. genitalium. This work establishes pooled-PPI prediction as a highly scalable method for uncovering protein-protein interactions and a powerful addition to the functional genomics toolkit.
Title: Predicting the protein interaction landscape of a free-living bacterium with pooled-AlphaFold3. Journal: Molecular Systems Biology.
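The pair and job counts quoted in this abstract can be checked with a little arithmetic. The sketch below assumes that "all pairwise PPIs" means unordered pairs of distinct proteins (no self-pairs), which is an interpretation and not something the abstract states. Under that assumption, 113,050 pairs corresponds to exactly 476 proteins. The helper names are hypothetical.

```python
import math


def n_pairs(n_proteins: int) -> int:
    """Number of unordered pairs of distinct proteins: n * (n - 1) / 2."""
    return math.comb(n_proteins, 2)


def proteins_for_pairs(pairs: int) -> int:
    """Invert n * (n - 1) / 2 = pairs, raising if no integer n fits."""
    n = (1 + math.isqrt(1 + 8 * pairs)) // 2
    if math.comb(n, 2) != pairs:
        raise ValueError(f"{pairs} is not a valid pair count")
    return n


PAIRS = 113_050  # pairwise PPIs quoted in the abstract
JOBS = 2027      # AlphaFold3 jobs quoted in the abstract

proteome_size = proteins_for_pairs(PAIRS)  # 476 under the stated assumption
pairs_per_job = PAIRS / JOBS               # average pairs covered per job
```

On these figures each pooled job covers roughly 56 pairs on average. That ratio is not the same quantity as the ~100-fold reduction in jobs quoted above, whose baseline the abstract does not spell out.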
Pub Date: 2026-01-19; DOI: 10.1038/s44320-025-00186-2
Viola Hollek, Francisca Böhning, Catalina Florez Vargas, Anja Sieber, Markus Morkel, Nils Blüthgen
Oncogenic mutations shape colorectal cancer (CRC) biology, yet their impact on transcriptional phenotypes remains incompletely understood, and their individual prognostic value is limited. Here, we perform a pooled single-cell transcriptomic screen of over 100,000 CRC cells with a comprehensive barcoded library of oncogenic variants across genetically diverse CRC lines. Using a variational autoencoder-based interpretable factor model, we identify ten conserved oncogene-driven transcriptional modules (TMOs) representing core cancer phenotypes such as cellular plasticity, inflammatory response, replicative stress, and epithelial-to-mesenchymal transition. Engagement of these modules can be context-dependent, reflecting interactions between oncogene-induced driver pathways and background genetics. TMO activity in patient tumors stratifies CRC cohorts into high- and low-risk groups, improving relapse-free survival prediction beyond existing classification systems. Our study systematically links oncogenic signaling to transcriptional states and clinical outcomes, establishing a functional framework for module-based patient stratification in precision oncology.
Title: Pooled single-cell screen in colorectal cancer defines transcriptional modules linked to oncogenes. Journal: Molecular Systems Biology.
Pub Date: 2026-01-19; DOI: 10.1038/s44320-025-00184-4
Marcell Veiner, Fran Supek
Following their success in natural language processing and protein biology, pretrained large language models have started appearing in genomics in large numbers. These genomic language models (gLMs), trained on diverse DNA and RNA sequences, promise improved performance on a variety of downstream prediction and understanding tasks. In this review, we trace the rapid evolution of gLMs, analyze current trends, and offer an overview of their application in genomic research. We investigate each gLM component in detail, from training data curation to the architecture, and highlight the present trends of increasing model complexity. We review major benchmarking efforts, suggesting that no single model dominates, and that task-specific design and pretraining data often outweigh general model scale or architecture. In addition, we discuss requirements for making gLMs practically useful for genomic research. While several applications, ranging from genome annotation to DNA sequence generation, showcase the potential of gLMs, their use highlights gaps and pitfalls that remain unresolved. This guide aims to equip researchers with a grounded understanding of gLM capabilities, limitations, and best practices for their effective use in genomics.
Title: The DNA dialect: a comprehensive guide to pretrained genomic language models. Journal: Molecular Systems Biology.
Pub Date: 2026-01-15; DOI: 10.1038/s44320-026-00188-8
Yu-Long Zhao, Yi-Ming Zhao, Yi-Fang Yan, Ning Yang, Si-Nan Ma, Rui-Jia Wang, Gui-Hai Feng, Zhi-Kun Li, Wei Li, Li-Bin Wang
Why eukaryotic genomes are universally divided among multiple chromosomes remains an unresolved question. Although yeast and mouse cells can tolerate chromosomal fusions without impairing viability, we show here that chromosome length in mammalian cells is constrained by a biophysical limit governed by spindle geometry. Using engineered mouse cells carrying fused chromosomes of defined sizes, we identify ~308 Mb as the maximal length tolerated for faithful mitosis. Chromosomes exceeding this threshold disrupt segregation, leading to daughter cell re-coalescence and polyploidization. Aurora B kinase regulates this process by modulating spindle elongation; its inhibition induces mitotic failure even in chromosome configurations within the tolerated threshold of ~308 Mb. These findings explain the structural basis for genome fragmentation in animals and reveal a general mechanism linking chromosome size, spindle dynamics, and genome stability.
Title: Chromosome length is constrained by spindle scaling to ensure faithful mitosis in mammals. Journal: Molecular Systems Biology.
Pub Date : 2026-01-12DOI: 10.1038/s44320-025-00185-3
Juan Carlos Nunez-Rodriguez, Miquel Àngel Schikora-Tamarit, Toni Gabaldón
The increasing prevalence of antifungal resistance represents a major clinical challenge. To explore potential new therapeutic avenues, we investigated fitness trade-offs associated with azole and echinocandin resistance in Nakaseomyces glabratus (syn. Candida glabrata), a priority yeast pathogen showing a growing incidence of drug and multidrug resistance. To this end, we comprehensively phenotyped a large collection (n = 77) of azole- and echinocandin-resistant strains to uncover resistance-associated stress sensitivity trade-offs. Our results show that increased stress sensitivity is a common trade-off of drug resistance in this species, with 98% of resistant strains exhibiting reduced fitness under at least one of six assayed stresses. Despite the diversity of genetic backgrounds and resistance mechanisms represented by our collection, we identified consistent trends in some resistance-associated vulnerabilities. Using multivariate modeling, we uncovered complex genetic interactions underlying these trade-offs. As a proof of concept for therapeutic potential, we experimentally validated the inhibitory effects of targeting some fitness trade-offs. Cyclosporin A selectively inhibited anidulafungin-resistant strains, while NaCl effectively suppressed the emergence of fluconazole resistance. This study highlights the widespread occurrence of fitness costs associated with antifungal resistance and emphasizes their potential as a novel therapeutic strategy against this growing threat.
Title: Uncovering actionable trade-offs of antifungal resistance in a yeast pathogen. Authors: Juan Carlos Nunez-Rodriguez, Miquel Àngel Schikora-Tamarit, Toni Gabaldón. Journal: Molecular Systems Biology. Publication date: 2026-01-12. DOI: 10.1038/s44320-025-00185-3 (https://doi.org/10.1038/s44320-025-00185-3)