Pub Date: 2026-01-07 DOI: 10.1093/genetics/iyaf219
Brieuc Lehmann, Hanbin Lee, Luke Anderson-Trocmé, Jerome Kelleher, Gregor Gorjanc, Peter L Ralph
Genetic relatedness is a central concept in genetics, underpinning studies of population and quantitative genetics in human, animal, and plant settings. It is typically stored as a genetic relatedness matrix, whose elements are pairwise relatedness values between individuals. This relatedness has been defined in various contexts based on pedigree, genotype, phylogeny, coalescent times, and, recently, the ancestral recombination graph. For some downstream applications, including association studies, using ancestral recombination graph-based genetic relatedness matrices has led to better performance relative to the genotype genetic relatedness matrix. However, these matrices present computational challenges due to their inherent quadratic time and space complexity. Here, we first discuss the different definitions of relatedness in a unifying context, making use of the additive model of a quantitative trait to provide a definition of "branch relatedness" and the corresponding "branch genetic relatedness matrix". We explore the relationship between branch relatedness and pedigree relatedness (i.e. kinship) through a case study of French-Canadian individuals with a known pedigree. Through the tree sequence encoding of an ancestral recombination graph, we then derive an efficient algorithm for computing products between the branch genetic relatedness matrix and a general vector, without explicitly forming the branch genetic relatedness matrix. This algorithm leverages the sparse encoding of genomes with the tree sequence and hence enables large-scale computations with the branch genetic relatedness matrix. We demonstrate the power of this algorithm by developing a randomized principal components algorithm for tree sequences that easily scales to millions of genomes. All algorithms are implemented in the open source tskit Python package.
Taken together, this work consolidates the different notions of relatedness as branch relatedness and, by leveraging the tree sequence encoding of an ancestral recombination graph, provides efficient algorithms that enable computations with the branch genetic relatedness matrix that scale to mega-scale genomic datasets.
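The matrix-free idea described in the abstract can be illustrated with a minimal sketch. This is not the tree-sequence algorithm or the tskit API: it assumes an ordinary genotype-based relatedness matrix Z Z^T / m built from a small, made-up 0/1 genotype table, and it swaps the randomized principal components method for plain power iteration. All function names are hypothetical. The point is only that the leading component can be found from matrix-vector products without ever storing the n x n matrix.

```python
import math
import random


def center_columns(genotypes):
    """Subtract each site's mean so every column of Z sums to zero."""
    n = len(genotypes)
    m = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in genotypes]


def grm_matvec(z, v):
    """Return (Z Z^T / m) v without building the n x n matrix."""
    n, m = len(z), len(z[0])
    # Step 1: project v onto sites, t = Z^T v (length m).
    t = [sum(z[i][j] * v[i] for i in range(n)) for j in range(m)]
    # Step 2: map back to genomes, Z t / m (length n).
    return [sum(z[i][j] * t[j] for j in range(m)) / m for i in range(n)]


def top_component(z, iterations=200, seed=1):
    """Power iteration for the leading eigenpair, using only matvecs."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in z]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = grm_matvec(z, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm  # ||A v|| for unit v approaches the top eigenvalue
    return eigenvalue, v


# Hypothetical toy data: 4 genomes (rows) by 3 sites (columns), 0/1 alleles.
genotypes = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
z = center_columns(genotypes)
eigenvalue, pc1 = top_component(z)
```

Each product costs time proportional to the size of the genotype table rather than to n squared. The algorithm in the paper exploits the same structure, with the tree sequence standing in for the dense table used here.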
Title: On ARGs, pedigrees, and genetic relatedness matrices.
Journal: Genetics
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12774834/pdf/
Pub Date: 2026-01-07 DOI: 10.1093/genetics/iyaf224
Michael J Stinchfield, Sudhindra R Gadagkar, Michael B O'Connor, Stuart J Newfeld
Human ApolipoproteinB (ApoB) exists in two isoforms that are packaged into low density lipoprotein particles and are major contributors to atherosclerosis. Similarly, Drosophila Apolipoprotein Lipophorin (ApoLpp) exists in two isoforms packaged into lipoprotein particles that cross the blood-brain barrier (BBB) in second instar larvae, where they deliver lipids to neuroblasts. To extend our understanding of ApoLpp function to adult brains and suggest new hypotheses for human ApoB, we document evolutionary conservation between the two N-terminal isoforms, human ApoB48 and fly ApoLppII. Our tissue-specific analyses, including rescue of apolpp lethality and apolpp RNAi, then showed that apolpp expression in the fat body is both necessary and sufficient for survival to adulthood. Our imaging studies of ApoLpp in the adult brain employed endogenous isoform-specific tagged proteins generated by the Fourth Chromosome Resource Project. Images revealed that both ApoLpp isoforms are present in the adult brain, with ApoLppII accumulation prominent near glia. Nanobody morphotrap experiments that blocked tagged ApoLpp at the BBB demonstrated that ApoLpp detected inside the adult brain is exogenous. An N- and C-terminal tagged ApoLpp transgene expressed solely in the fat body facilitated tracking of each isoform from fat body secretion to the BBB and then inside the adult brain. Overall, our data suggest that the known role of ApoLpp in lipid delivery to larval brains likely continues in adults. Strong conservation between ApoLppII and ApoB48 supports the hypothesis that ApoB48 may have a role in the brain outside the circulatory system.
Title: Both isoforms of Drosophila ApoLpp (ApoB) cross the blood-brain barrier in adults.
Journal: Genetics
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12774823/pdf/
Pub Date: 2026-01-07 DOI: 10.1093/genetics/iyaf235
Scott A Keith, Ananda A Kalukin, Dana S Vargas Solivan, Melanie R Smee, Brian P Lazzaro
The ability to direct tissue-specific overexpression of transgenic proteins in genetically tractable organisms like Drosophila melanogaster has facilitated innumerable biological discoveries. However, transgenic proteins can themselves impact cellular and physiological processes in ways that are often ignored or poorly defined. Here we discovered that the yolk-GAL4 transgene, which directs strong expression of the yeast GAL4 transcription factor in the Drosophila fat body, induces significant physiological defects in adult female flies. We found that yolk-GAL4 disrupts adipose tissue integrity and reduces fat body lipid stores, egg production, and resistance to systemic bacterial infections. Knocking down GAL4 expression in yolk-GAL4 heterozygotes using RNAi fully suppressed each of these defects, thus confirming that the GAL4 transgene product induces these phenotypes. Comparing a panel of additional fat body driver lines, we found that GAL4 expression levels directly correlate with infection susceptibility, but not with fat levels or egg production. To determine whether other transgenic proteins can impair fat body function, we constructed new fly lines in which the yolk enhancer directs expression of either cytoplasmic or nuclear-localized mCherry, or an alternative transactivator, LexA. We found that only nuclear-localized mCherry and LexA increased infection susceptibility similarly to GAL4, suggesting that intranuclear transgenic proteins in general can curtail the fat body's induced immune response in a manner highly sensitive to transgene expression strength. Additionally, these new lines can be valuable tools for future studies. More broadly, our findings highlight the potential for transgenes to substantially impact organismal biology and emphasize the importance of rigorously characterizing genetic tools to optimally leverage model systems like Drosophila.
Title: Strong GAL4 expression compromises Drosophila fat body function.
Journal: Genetics
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12774853/pdf/
Sequence-specific transcription factors (TFs) are key regulators of many biological processes, controlling the expression of their target genes (TGs) by binding to cis-regulatory regions such as promoters and enhancers. Each TF has unique DNA binding site motifs, and large-scale experiments have been conducted to characterize TF-DNA binding preferences. However, no comprehensive resource currently integrates these datasets for Drosophila. To address this need, we developed TF2TG ("transcription factor" to "target gene"), a comprehensive resource that combines both in vitro and in vivo datasets to link TFs to their TGs based on TF-DNA binding preferences along with protein-protein interaction data, tissue-specific transcriptomic data, and chromatin accessibility data. Although the genome offers numerous potential binding sites for each TF, only a subset is actually bound in vivo, and of these, only a fraction is functionally relevant. For instance, some TFs bind to their specific sites due to synergistic interactions with other factors nearby. This integration provides users with a comprehensive list of potential candidates and helps them rank candidate genes and determine condition-specific TF binding when studying transcriptional regulation in Drosophila.
Title: TF2TG: an online resource mining the potential gene targets of transcription factors in Drosophila.
Authors: Yanhui Hu, Jonathan Rodiger, Yifang Liu, Chenxi Gao, Ying Liu, Mujeeb Qadiri, Austin Veal, Martha Leonia Bulyk, Norbert Perrimon
Pub Date: 2026-01-07 DOI: 10.1093/genetics/iyaf082
Journal: Genetics
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12774851/pdf/
Protein translation regulation is critical for cellular responses and development, yet how elongation stage disruptions shape these processes remains incompletely understood. Here, we identify a single amino acid substitution (P55Q) in the ribosomal protein RPL-36A of Caenorhabditis elegans that confers complete resistance to the elongation inhibitor cycloheximide (CHX). Heterozygous animals carrying both wild-type RPL-36A and RPL-36A(P55Q) develop normally but show intermediate CHX resistance, indicating a partial dominant effect. Leveraging RPL-36A(P55Q) as a single-copy positive selection marker for CRISPR-based genome editing, we introduced targeted modifications into multiple ribosomal protein genes, confirming its broad utility for altering essential loci. In L4-stage heterozygotes, where CHX-sensitive and CHX-resistant ribosomes coexist, ribosome profiling revealed increased start-codon occupancy, reduced disome formation, and no codon-specific pausing. Surprisingly, chronic CHX treatment did not activate canonical stress pathways (ribosome quality control, integrated stress response, and ribotoxic stress response), as indicated by the absence of RPS-10 ubiquitination, eIF2α or PMK-1 phosphorylation, or ATF-4 induction. Instead, RNA-normalized ribosome footprints revealed selective changes in translation efficiency (TE), with reduced nucleolar/P-granule components and increased oocyte development genes. Consistently, premature oocyte development was observed in L4 animals. These findings suggest that partial inhibition of translation elongation disrupts developmental timing across tissues, likely by altering TE.
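Translation efficiency as used above, ribosome footprints normalized by RNA abundance, can be written out as a short calculation. The sketch below is an illustration under stated assumptions, not the authors' pipeline: it assumes raw per-gene read counts, counts-per-million scaling, and an optional pseudocount. The function names and gene names are hypothetical.

```python
import math


def translation_efficiency(footprint_counts, rna_counts, pseudocount=1.0):
    """Per-gene TE = library-scaled ribosome footprints / library-scaled RNA reads."""
    fp_total = sum(footprint_counts.values())
    rna_total = sum(rna_counts.values())
    te = {}
    for gene in footprint_counts:
        # Counts-per-million scaling removes library-size differences.
        fp_cpm = (footprint_counts[gene] + pseudocount) / fp_total * 1e6
        rna_cpm = (rna_counts[gene] + pseudocount) / rna_total * 1e6
        te[gene] = fp_cpm / rna_cpm
    return te


def log2_te_change(te_treated, te_control):
    """log2 fold change in TE; negative values mean reduced efficiency."""
    return {g: math.log2(te_treated[g] / te_control[g]) for g in te_control}


# Hypothetical counts for two genes under control and treated conditions.
control = translation_efficiency({"geneA": 100, "geneB": 100},
                                 {"geneA": 100, "geneB": 100}, pseudocount=0.0)
treated = translation_efficiency({"geneA": 50, "geneB": 150},
                                 {"geneA": 100, "geneB": 100}, pseudocount=0.0)
change = log2_te_change(treated, control)
```

Dividing by RNA abundance is what separates a change in translation from a change in transcript level: a gene whose footprints and mRNA both halve has an unchanged TE.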
Title: Cycloheximide-resistant ribosomes reveal adaptive translation dynamics in C. elegans.
Authors: Qiuxia Zhao, Blythe Bolton, Reed Rothe, Reiko Tachibana, Can Cenik, Elif Sarinay Cenik
Pub Date: 2026-01-07 DOI: 10.1093/genetics/iyaf189
Journal: Genetics
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12477835/pdf/
Pub Date: 2026-01-07 DOI: 10.1093/genetics/iyaf250
Julia Beets, Julia Höglund, Bernard Y Kim, Jacintha Ellers, Katja M Hoedjes, Mirte Bosse
Understanding how genetic variants drive phenotypic differences is a major challenge in molecular biology. Single nucleotide polymorphisms form the vast majority of genetic variation and play critical roles in complex, polygenic phenotypes, yet their functional impact is poorly understood from traditional gene-level analyses. In-depth knowledge about the impact of single nucleotide polymorphisms has broad applications in health and disease, population genomics, and evolution studies. The wealth of genomic data and available functional genetic tools make Drosophila melanogaster an ideal model species for studies at single nucleotide resolution. However, to leverage these resources for genotype-phenotype research and potentially combine them with the power of functional genetics, it is essential to develop techniques to predict functional impact and causality of single nucleotide variants. Here, we present FlyCADD, a functional impact prediction tool for single nucleotide variants in D. melanogaster. FlyCADD, based on the Combined Annotation-Dependent Depletion (CADD) framework, integrates over 650 genomic features (including conservation scores, GC content, and DNA secondary structure) into a single metric reflecting a variant's predicted impact on evolutionary fitness. FlyCADD provides impact prediction scores for any single nucleotide variant in the D. melanogaster genome. We demonstrate the power of FlyCADD for typical applications, such as the ranking of phenotype-associated variants to prioritize variants for follow-up studies, evaluation of naturally occurring polymorphisms, and refining of CRISPR-Cas9 experimental design. FlyCADD provides a powerful framework for interpreting the functional impact of any single nucleotide variant in D. melanogaster, thereby improving our understanding of genotype-phenotype connections.
Title: Predicting the functional impact of single nucleotide variants in Drosophila melanogaster with FlyCADD.
Journal: Genetics
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12774831/pdf/
Pub Date: 2026-01-07 DOI: 10.1093/genetics/iyaf191
Mengyue Liu, Bu Zi, Hebin Zhang, Hong Zhang
Codon usage bias refers to the unequal usage of synonymous codons. This phenomenon is fundamentally important in biology as it is jointly shaped by mutation, genetic drift, and natural selection, and influences translation rate, decoding accuracy, and mRNA stability. However, popular tools for codon usage bias analysis are neither flexible nor efficient enough and fail to incorporate recent advancements in this field. To address these issues, we developed the Codon Usage Bias Analysis in R (cubar) package. Cubar is highly modular and can calculate common codon usage indexes in a user-friendly manner. In addition, it can perform sliding-window analyses of codon usage, assess differential usage between gene sets, and optimize user-provided genes based on the codon usage of a target organism. Furthermore, cubar is highly efficient and can analyze millions of coding sequences within a few minutes on a laptop.
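To make "codon usage index" concrete, here is a minimal sketch of one widely used index, relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The abstract does not say which indexes cubar computes, so the choice of RSCU is an assumption. The code is plain Python rather than cubar's R interface, and the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(sequence):
    """Return RSCU per codon for the amino acids present in an in-frame CDS."""
    seq = sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families keyed by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids that never occur
        expected = total / len(family)  # equal-usage expectation
        for codon in family:
            result[codon] = counts[codon] / expected
    return result


# Four alanine codons: GCT three times, GCC once.
values = rscu("GCTGCTGCTGCC")
```

An RSCU above 1 marks a codon used more often than equal usage predicts. In the alanine example, GCT scores 3.0, GCC 1.0, and the unused GCA and GCG 0.0.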
Title: Cubar: a versatile package for codon usage bias analysis in R.
Journal: Genetics
Pub Date: 2026-01-07 DOI: 10.1093/genetics/iyaf234
Indira Krishnan, Lev Y Yampolsky, Kseniya Petrova, Leonid Peshkin
Detailed knowledge of transcriptional responses to environmental cues or developmental stimuli requires single-cell resolution. We performed 2 single-cell RNAseq experiments of adult females and males of Daphnia magna, a freshwater planktonic crustacean that is both a classic and an emerging model for ecophysiology, toxicology, and evolutionary genomics. We identified >25 distinct cell types, about half of which could be functionally annotated. First, we identified ovary- and testis-related cell types by focusing on female- and male-specific clusters. Second, we compared markers between cell clusters and bulk RNAseq data on transcriptional profiles of early embryos, circulating hemocytes, midgut, heads (containing brain, eyes, muscles, and hepatic caeca), antennae II, and carapace. Finally, we compared transcriptional profiles of Daphnia cell clusters with orthologous markers of 200+ cell types annotated in the Drosophila cell atlas. This allowed us to recognize striated myocytes, enterocytes, cuticular cells, as well as 9 different neuron types, including photoreceptors. Several well-defined clusters showed significant enrichment for markers of both hemocytes and either fat body, ovaries, or certain neuron types of Drosophila, but not for markers from bulk RNAseq data of circulating hemocytes. This allowed us to hypothesize the existence of noncirculating, fat body-, ovary-, or neuron-associated populations of hemocytes in Daphnia. The circulating hemocytes express numerous cuticular proteins, suggesting a role in wound repair in addition to macrophagy. Our data will be useful as a baseline resource for researchers using Daphnia to answer questions in ecophysiology, toxicology, and the biology of adaptation to changing environments.
Title: Single-cell transcriptome defines cell-type repertoire of adult Daphnia magna.
Journal: Genetics
Pub Date : 2026-01-07 DOI: 10.1093/genetics/iyaf036
Arttu Arjas, Kalle Leppälä, Mikko J Sillanpää
Many quantitative traits can be measured from a single individual only once, making acquisition of longitudinal data impossible. In this paper, we present Gaussian process restricted Bayesian estimation, a new method tailored for estimating posterior distributions of longitudinal variance components from data where each individual contributes only 1 measurement at a single time point to the study. By pooling all time points, however, the data can be treated as longitudinal at the population level, which makes it possible to estimate longitudinal variance components. The method can also be applied to reaction norm problems, where the value of a continuous environmental condition (e.g. temperature) is commonly measured only once per individual. The work is based on a Bayesian framework with Markov chain Monte Carlo estimation and Gaussian process-based smoothing priors for the variance components. The performance of the method is illustrated with simulated and real data sets and compared with a random regression model. Our method is very stable and flexible in handling any kind of smooth curve. Uncertainty around the variance curves is represented with 95% credible interval curves computed from the posterior distribution. The code is available at the GitHub repository https://github.com/aarjas/GP-REBE.
Title: Posterior estimation of longitudinal variance components from nonlongitudinal data using Bayesian Gaussian process model. Journal: Genetics. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12774850/pdf/
Pub Date : 2026-01-07 DOI: 10.1093/genetics/iyaf130
Changyuan Wang, Denis F Faerberg, Stanislav Y Shvartsman, Robert A Marmion
Studies in Drosophila have contributed a great deal to our understanding of developmental mechanisms. Indeed, familiar names of critical signaling components, such as Hedgehog and Notch, have their origins in the readily identifiable morphological phenotypes of Drosophila. Most studies that led to the identification of these and many other highly conserved genes were based on the end-point phenotypes, such as the larval cuticle or the adult wing. Additional information can be extracted from longitudinal studies, which can reveal how the phenotypes emerge over time. Here we present the Fruit Fly Auxodrome, an experimental setup that enables monitoring and quantitative analysis of the entirety of development of 96 individually housed Drosophila from hatching to eclosion. The Auxodrome combines an inexpensive live imaging setup and a computer vision pipeline that provides access to a wide range of quantitative information, such as the times of hatching and pupation, as well as dynamic patterns of larval activity. We demonstrate the Auxodrome in action by recapitulating several previously reported features of wild-type development as well as developmental delay in a Drosophila model of a human disease. The scalability of the presented design makes it readily suitable for large-scale longitudinal studies in multiple developmental contexts.
Title: The Fruit Fly Auxodrome: a computer vision setup for longitudinal studies of Drosophila development. Journal: Genetics. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12774832/pdf/