András Szabó, Zsolt Bánfai, Katalin Sümegi, Valerián Ádám, Ferenc Gallyas, Miklós Kásler, Béla Melegh
Background/objectives: The Seklers are a Hungarian-speaking regional population in Transylvania, Romania, with a long and complex history, yet comprehensive genome-wide studies remain limited. Our aim was to characterize the genetic background of multiple Sekler communities using high-density autosomal data and to place them in a broader Central and Eastern European context.
Methods: Here we analyzed genome-wide autosomal SNP data obtained from 17 Sekler groups. Allele frequency- and haplotype-based approaches were applied to assess overall genetic structure, ancestry patterns, recent shared ancestry, and signals of demographic history.
Results: Analyses based on overall allele-frequency patterns showed that Sekler groups fit into a single, coherent genetic cluster shared with Hungarians. No major differences were detected among the Sekler communities at this broader genomic level, and their genetic profiles were largely indistinguishable from one another. Using haplotype-based methods, most Sekler groups again formed a compact cluster. However, two villages, Deményháza and Nyárádszentimre, showed clear signs of increased within-group relatedness and subtle separation. These patterns might indicate that both communities experienced stronger local drift and reduced effective population size, while other Sekler groups showed no comparable deviation from the general regional pattern.
Conclusions: Although a small number of villages display modest signs of localized demographic drift, our results indicate that the Seklers represent a regionally distinct and internally cohesive population whose genetic structure is shaped mainly by common historical and linguistic ties, with minor village-level variation, and that they form a uniform part of the Hungarian-speaking population of the East-Central European region.
{"title":"Uncovering the Genetic Structure of the Sekler Population in Transylvania Through Genome-Wide Autosomal Data.","authors":"András Szabó, Zsolt Bánfai, Katalin Sümegi, Valerián Ádám, Ferenc Gallyas, Miklós Kásler, Béla Melegh","doi":"10.3390/genes17010030","DOIUrl":"10.3390/genes17010030","url":null,"abstract":"<p><strong>Background/objectives: </strong>The Seklers are a Hungarian-speaking regional population in Transylvania, Romania, with a long and complex history, yet comprehensive genome-wide studies remain limited. Our aim was to characterize the genetic background of multiple Sekler communities using high-density autosomal data and to place them in a broader Central and Eastern European context.</p><p><strong>Methods: </strong>Here we analyzed genome-wide autosomal SNP data obtained from 17 Sekler groups. Allele frequency- and haplotype-based approaches were applied to assess overall genetic structure, ancestry patterns, recent shared ancestry, and signals of demographic history.</p><p><strong>Results: </strong>Analyses based on overall allele-frequency patterns showed that Sekler groups fit into a single, coherent genetic cluster shared with Hungarians. No major differences were detected among the Sekler communities at this broader genomic level, and their genetic profiles were largely indistinguishable from one another. Using haplotype-based methods, most Sekler groups again formed a compact cluster. However, two villages, Deményháza and Nyárádszentimre, showed clear signs of increased within-group relatedness and subtle separation. 
These patterns might indicate that both communities experienced stronger local drift and reduced effective population size, while other Sekler groups showed no comparable deviation from the general regional pattern.</p><p><strong>Conclusions: </strong>Although a small number of villages display modest signs of localized demographic drift, our results support that the Seklers represent a regionally distinct and internally cohesive population, whose genetic structure is shaped mainly by common historical and linguistic ties, with minor village-level variation, forming a uniform part of the Hungarian-speaking population of the East-Central European region.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12840628/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gabriel Monteiro de Lima, Mônica Andressa Leite Rodrigues, Rômulo Veiga Paixão, Ítalo Lutz, Manoel Alessandro Borges Aviz, Janieli do Socorro Amorim da Luz Sousa, Bruna Ramalho Maciel, Luciano Domingues Queiroz, Carlos Murilo Tenório Maciel, Iracilda Sampaio, Eduardo Sousa Varela, Cristiana Ramalho Maciel
Background/Objectives: The selection and validation of species-specific housekeeping genes (HKGs) have become increasingly common in functional genomics, with application of quantitative Polymerase Chain Reaction (qPCR) or cDNA-based qPCR (RT-qPCR). Although RNA-seq studies are available for Macrobrachium amazonicum, there are still no data on the most stable and consistent HKGs for use in relative gene expression analyses. Therefore, the present study aimed to identify and validate seven HKGs in M. amazonicum: Eukaryotic Translation Initiation Factor (EIF), 18S ribosomal RNA (18S), Ribosomal Protein L18 (RPL18), β-actin, α-tubulin (α-tub), Elongation Factor 1-α (EF-1α), and Glyceraldehyde-3-phosphate Dehydrogenase (GAPDH). Methods: The HKGs were identified in the M. amazonicum transcriptome, characterized for identity confirmation, and compared against public databases. Subsequently, RT-qPCR assays were prepared using muscle, hepatopancreas, gills, testis, androgenic gland, and ovary to assess the stability of the HKG markers, employing the comparative ∆Ct, BestKeeper, NormFinder, and GeNorm methods. Results: All candidate HKGs identified showed high similarity with other decapods. Reactions performed with these markers demonstrated high specificity, PCR efficiency, and elevated coefficients of determination. The comprehensive ranking indicated that no single HKG was stable across all tissues, and the best-performing HKGs were tissue-specific. The most stable HKGs were RPL18 and 18S. GAPDH, historically used as an HKG, showed the poorest performance in stability ranking for most tissues tested, whereas β-actin was the most suitable only for ovary. Conclusions: These data reinforce the need for species-specific HKG validation and provide an appropriate panel of reference markers for gene expression studies in M. amazonicum.
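The comparative ∆Ct method named in the Methods can be illustrated with a minimal sketch. It assumes the usual formulation of that method: for every pair of candidate genes, take the per-sample Ct difference, compute its standard deviation across samples, and score each gene by the mean of the standard deviations from all pairs it takes part in (lower means more stable). The function name, the gene labels, and the Ct values below are hypothetical and are not data from the study.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct):
    """Score candidate reference genes with the comparative delta-Ct approach.

    ct maps a gene label to its Ct values, one per sample, in the same
    sample order for every gene. Returns gene -> mean pairwise SD of
    delta-Ct; a lower score suggests a more stable gene.
    """
    pair_sd = {gene: [] for gene in ct}
    for a, b in combinations(ct, 2):
        # Per-sample Ct difference between the two genes.
        diffs = [x - y for x, y in zip(ct[a], ct[b])]
        sd = stdev(diffs)
        pair_sd[a].append(sd)
        pair_sd[b].append(sd)
    return {gene: mean(sds) for gene, sds in pair_sd.items()}


# Hypothetical Ct values (illustration only, four samples per gene).
example_ct = {
    "geneA": [20.0, 20.5, 21.0, 20.5],
    "geneB": [10.0, 10.5, 11.0, 10.5],
    "geneC": [18.0, 22.0, 19.0, 23.0],
}

scores = delta_ct_stability(example_ct)
ranking = sorted(scores, key=scores.get)  # most stable first
print(ranking)
```

In this toy input geneA and geneB move together across samples, so their pairwise ∆Ct has zero spread, while geneC varies independently and ranks last. Tools such as BestKeeper, NormFinder, and GeNorm use different stability statistics and are not reproduced here.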
{"title":"Identification and Validation of Tissue-Specific Housekeeping Markers for the Amazon River Prawn <i>Macrobrachium amazonicum</i> (Heller, 1862).","authors":"Gabriel Monteiro de Lima, Mônica Andressa Leite Rodrigues, Rômulo Veiga Paixão, Ítalo Lutz, Manoel Alessandro Borges Aviz, Janieli do Socorro Amorim da Luz Sousa, Bruna Ramalho Maciel, Luciano Domingues Queiroz, Carlos Murilo Tenório Maciel, Iracilda Sampaio, Eduardo Sousa Varela, Cristiana Ramalho Maciel","doi":"10.3390/genes17010026","DOIUrl":"10.3390/genes17010026","url":null,"abstract":"<p><p><b>Background/Objectives</b>: The selection and validation of species-specific housekeeping genes (HKGs) have become increasingly common in functional genomics, with application of quantitative Polymerase Chain Reaction (qPCR) or cDNA-based qPCR (RT-qPCR). Despite the <i>Macrobrachium amazonicum</i> having RNA-seq studies available, there are still no data on the most stable and consistent HKGs for use in relative gene expression analyses. Therefore, the present study aimed to identify and validate seven HKGs in <i>M. amazonicum</i>: Eukaryotic Translation Initiation Factor (EIF), 18S ribosomal RNA (18S), Ribosomal Protein L18 (RPL18), β-actin, α-tubulin (α-tub), Elongation Factor 1-α (EF-1α), and Glyceraldehyde-3-phosphate Dehydrogenase (GAPDH). <b>Methods</b>: The HKGs were identified in the <i>M. amazonicum</i> transcriptome, characterized for identity confirmation, and compared against public databases. Subsequently, RT-qPCR assays were prepared using muscle, hepatopancreas, gills, testis, androgenic gland, and ovary to assess the stability of the HKG markers, employing the comparative ∆Ct, BestKeeper, NormFinder, and GeNorm methods. <b>Results</b>: All candidate HKGs identified showed high similarity with other decapods. Reactions performed with these markers demonstrated high specificity, PCR efficiency, and elevated coefficients of determination. 
The comprehensive ranking, indicated that no single HKG was stable across all tissues, with HKGs showing the best stability being tissue-specific. The most stable HKGs were RPL18 and 18S. GAPDH, historically used as an HKG, showed the poorest performance in stability ranking for most tissues tested, whereas β-actin was most suitable only for ovarian. <b>Conclusions</b>: These data reinforce the need for species-specific HKG validation and provide an appropriate panel of reference markers for gene expression studies in the <i>M. amazonicum</i>.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12840830/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lina Jiang, Ping Sun, Tingting Zhou, Yang Liu, Zihan Kong, Nan Zhang, Hongli He, Xingzheng Zhang
Background: Phosphorus is an essential nutrient for plant growth and development, playing a multifaceted and vital role in plants. Phosphate Transporter 1 (PHO1) is a class of important functional genes involved in plant phosphorus uptake and transport. We identify PHOSPHATE 1 (PHO1) members in mung beans and investigate their response to low phosphorus stress, thereby aiding in the development of stress-tolerant, high-yielding mung bean varieties. Methods: A bioinformatic analysis was performed, which led to the identification of the PHO1 homologue sequence in mung beans. This analysis also elucidated its gene and protein structural characteristics alongside its phylogenetic relationships. qRT-PCR was used to analyze the expression patterns of genes in roots and leaves in response to conditions of prolonged low-phosphorus and phosphorus-deprivation stress. Results: Total PHO1 homologues were identified in mung beans, which can be grouped into three groups (Groups I-III). Phylogenetic analysis indicates that the VrPHO1s share closer evolutionary relationships with PHO1 in legumes and exhibit 6 collinear gene pairs with Glycine max (soybean), all with Ka/Ks ratios below 1, suggesting they have undergone purifying selection. The gene promoter regions contain multiple cis-acting elements capable of participating in plant growth and development, stress responses, and plant hormone responses. Expression analysis revealed that more VrPHO1 genes responded to phosphorus stress in roots than in leaves; of these, the expression of the VrPHO1;H2, VrPHO1;H3, and VrPHO1;H5 genes was significantly induced by continuous phosphorus-deficient stress. Conclusions: This study provides a comprehensive genome-wide identification of the PHO1 family in mung bean and provides valuable candidate gene resources for the future study of their biological functions and regulatory roles in phosphate-deficient stress.
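The reasoning behind "Ka/Ks ratios below 1, suggesting purifying selection" can be made explicit with a small sketch. It assumes the conventional reading of the ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates: below 1 suggests purifying selection, about 1 suggests neutral evolution, above 1 suggests positive selection. The function name, the tolerance, and the input values are hypothetical; the abstract does not report the individual Ka and Ks values.

```python
def classify_selection(ka, ks, tol=0.05):
    """Return (Ka/Ks, label) under the conventional interpretation.

    ka, ks: nonsynonymous and synonymous substitution rates for one
    collinear gene pair. tol is an illustrative band around 1 that is
    treated as "neutral"; it is an assumption, not a standard cutoff.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio < 1 - tol:
        label = "purifying"
    elif ratio > 1 + tol:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label


# Hypothetical rates for one gene pair (illustration only).
ratio, label = classify_selection(ka=0.05, ks=0.50)
print(round(ratio, 2), label)
```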
{"title":"Characterization of the <i>PHO1</i> Gene Family in <i>Vigna radiata</i> L. and Its Expression Analysis Under Phosphate-Deficient Stress.","authors":"Lina Jiang, Ping Sun, Tingting Zhou, Yang Liu, Zihan Kong, Nan Zhang, Hongli He, Xingzheng Zhang","doi":"10.3390/genes17010025","DOIUrl":"10.3390/genes17010025","url":null,"abstract":"<p><p><b>Background:</b> Phosphorus is an essential nutrient for plant growth and development, playing a multifaceted and vital role in plants. Phosphate Transporter 1 (PHO1) is a class of important functional genes involved in plant phosphorus uptake and transport. We identify <i>PHOSPHATE 1</i> (<i>PHO1</i>) members in mung beans and investigate their response to low phosphorus stress, thereby aiding in the development of stress-tolerant, high-yielding mung bean varieties. <b>Methods:</b> A bioinformatic analysis was performed, which led to the identification of the <i>PHO1</i> homologue sequence in mung beans. This analysis also elucidated its gene and protein structural characteristics alongside its phylogenetic relationships. qRT-PCR was used to analyze the expression patterns of genes in roots and leaves in response to conditions of prolonged low-phosphorus and phosphorus-deprivation stress. <b>Results:</b> Total <i>PHO1</i> homologues were identified in mung beans, which can be grouped into 3 groups (Group I-III). Phylogenetic analysis indicates that <i>VrPHO1</i>s shares closer evolutionary relationships with <i>PHO1</i> in legumes, and exhibits 6 collinear gene pairs with <i>Glycine max</i> (soybean), all with <i>Ka/Ks</i> ratios below 1, suggesting they have undergone purifying selection. The gene promoter region contains multiple cis-acting elements capable of participating in plant growth and development, stress responses, and plant hormone responses. 
Expression analysis revealed that more <i>VrPHO1</i> genes responded to phosphorus stress in roots than in leaves; of these, the expression of <i>VrPHO1; H2</i>, <i>VrPHO1; H3</i>, and <i>VrPHO1; H5</i> genes was significantly induced by continuous phosphorus-deficient stress. <b>Conclusions:</b> This study provides a comprehensive genome-wide identification of the <i>PHO1</i> family in mung bean and provides valuable candidate gene resources for the future study of their biological functions and regulatory roles in phosphate-deficient stress.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12841144/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Accurate recognition of promoter sequences in Escherichia coli is fundamental for understanding gene regulation and engineering synthetic biological systems. However, existing computational methods struggle to simultaneously model long-range genomic dependencies and fine-grained local motifs, particularly the degenerate -10 and -35 elements of σ70 promoters. To address this gap, we propose DNABERT2-CAMP, a novel hybrid deep learning framework designed to integrate global contextual understanding with high-resolution local motif detection for robust promoter identification.
Methods: We constructed a balanced dataset of 8720 experimentally validated and negative 81-bp sequences from RegulonDB, literature, and the E. coli K-12 genome. Our model combines a pre-trained DNABERT-2 Transformer for global sequence encoding with a custom CAMP module (CNN-Attention-Mean Pooling) for local feature refinement. We evaluated performance using 5-fold cross-validation and an independent external test set, reporting standard metrics including accuracy, ROC AUC, and Matthews correlation coefficient (MCC).
Results: DNABERT2-CAMP achieved 93.10% accuracy and 97.28% ROC AUC in cross-validation, outperforming existing methods including DNABERT. On an independent test set, it maintained strong generalization (89.83% accuracy, 92.79% ROC AUC). Interpretability analyses confirmed biologically plausible attention over canonical promoter regions and CNN-identified AT-rich/-35-like motifs.
Conclusions: DNABERT2-CAMP demonstrates that synergistically combining pre-trained Transformers with convolutional motif detection significantly improves promoter recognition accuracy and interpretability. This framework offers a powerful, generalizable tool for genomic annotation and synthetic biology applications.
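Two of the metrics named in the Methods, accuracy and the Matthews correlation coefficient (MCC), can be computed directly from a binary confusion matrix. The sketch below uses the standard definitions; the function names and the counts are hypothetical and do not reproduce the paper's confusion matrix or its evaluation code.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of sequences classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a balanced promoter / non-promoter test set.
counts = dict(tp=90, tn=85, fp=15, fn=10)
print(accuracy(**counts), round(mcc(**counts), 4))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy and ROC AUC.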
{"title":"DNABERT2-CAMP: A Hybrid Transformer-CNN Model for <i>E. coli</i> Promoter Recognition.","authors":"Hua-Lin Xu, Xiu-Jun Gong, Hua Yu, Ying-Kai Wang","doi":"10.3390/genes17010027","DOIUrl":"10.3390/genes17010027","url":null,"abstract":"<p><strong>Background: </strong>Accurate recognition of promoter sequences in <i>Escherichia coli</i> is fundamental for understanding gene regulation and engineering synthetic biological systems. However, existing computational methods struggle to simultaneously model long-range genomic dependencies and fine-grained local motifs, particularly the degenerate -10 and -35 elements of σ70 promoters. To address this gap, we propose DNABERT2-CAMP, a novel hybrid deep learning framework designed to integrate global contextual understanding with high-resolution local motif detection for robust promoter identification.</p><p><strong>Methods: </strong>We constructed a balanced dataset of 8720 experimentally validated and negative 81-bp sequences from RegulonDB, literature, and the <i>E. coli</i> K-12 genome. Our model combines a pre-trained DNABERT-2 Transformer for global sequence encoding with a custom CAMP module (CNN-Attention-Mean Pooling) for local feature refinement. We evaluated performance using 5-fold cross-validation and an independent external test set, reporting standard metrics including accuracy, ROC AUC, and Matthews correlation coefficient (MCC).</p><p><strong>Results: </strong>DNABERT2-CAMP achieved 93.10% accuracy and 97.28% ROC AUC in cross-validation, outperforming existing methods including DNABERT. On an independent test set, it maintained strong generalization (89.83% accuracy, 92.79% ROC AUC). 
Interpretability analyses confirmed biologically plausible attention over canonical promoter regions and CNN-identified AT-rich/-35-like motifs.</p><p><strong>Conclusions: </strong>DNABERT2-CAMP demonstrates that synergistically combining pre-trained Transformers with convolutional motif detection significantly improves promoter recognition accuracy and interpretability. This framework offers a powerful, generalizable tool for genomic annotation and synthetic biology applications.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12841110/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background/Objectives: Soroseris hookeriana, a Tibetan medicinal plant endemic to the high-altitude Qinghai-Tibet Plateau, possesses significant pharmacological value but lacks fundamental genomic characterization. This study aims to generate and comparatively analyse its complete chloroplast genome. Methods: Total DNA was sequenced, assembled with GetOrganelle, annotated with CPGAVAS2, and compared with eight Asteraceae species; phylogenetic placement was inferred with IQ-TREE from 21 complete plastomes. Results: The circular chloroplast genome is 152,514 bp with a typical quadripartite structure (LSC 84,168 bp, SSC 18,528 bp, two IRs 24,909 bp each). It contains 132 unique genes (87 protein-coding, 37 tRNA, 8 rRNA; 18 duplicated in IRs yield 150 total copies). Twenty-three genes harbour introns; clpP and ycf3 have two. Overall GC content is 37.73%, elevated in IRs (43.12%). Codon usage shows strong A/U bias at the third position; 172 SSRs and 39 long repeats are detected. IR-SC boundaries exhibit the greatest inter-specific variation, notably in ycf1 and ndhF. Conclusions: The complete plastome robustly supports S. hookeriana and Stebbinsia umbrella as sister species (100% bootstrap) and provides essential genomic resources for species identification, population genetics, and studies of high-altitude adaptation.
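The region lengths and gene counts reported in the Results are internally consistent, which a short arithmetic check makes explicit. All numbers below are taken from the abstract; only the variable names are introduced here.

```python
# Region lengths in bp, as reported in the abstract.
LSC, SSC, IR = 84_168, 18_528, 24_909

# Quadripartite structure: one LSC, one SSC, and two inverted repeats.
total_length = LSC + SSC + 2 * IR

# Unique genes by class, then total copies including IR duplicates.
unique_genes = 87 + 37 + 8      # protein-coding + tRNA + rRNA
total_copies = unique_genes + 18

# Share of the plastome occupied by the two IR copies.
ir_fraction = 2 * IR / total_length

print(total_length, unique_genes, total_copies, round(ir_fraction, 4))
```

The sum matches the reported genome size of 152,514 bp, and the gene counts match the reported 132 unique genes and 150 total copies.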
{"title":"Complete Chloroplast Genome Sequence and Phylogenetic Analysis of the Tibetan Medicinal Plant <i>Soroseris hookeriana</i>.","authors":"Tian Tian, Xiuying Lin, Yiming Wang, Jiuli Wang","doi":"10.3390/genes17010024","DOIUrl":"10.3390/genes17010024","url":null,"abstract":"<p><p><b>Background/Objectives</b>: <i>Soroseris hookeriana</i>, a Tibetan medicinal plant endemic to the high-altitude Qinghai-Tibet Plateau, possesses significant pharmacological value but lacks fundamental genomic characterization. This study aims to generate and comparatively analyse its complete chloroplast genome. <b>Methods</b>: Total DNA was sequenced, assembled with GetOrganelle, annotated with CPGAVAS2, and compared with eight Asteraceae species; phylogenetic placement was inferred with IQ-TREE from 21 complete plastomes. <b>Results</b>: The circular chloroplast genome is 152,514 bp with a typical quadripartite structure (LSC 84,168 bp, SSC 18,528 bp, two IRs 24,909 bp each). It contains 132 unique genes (87 protein-coding, 37 tRNA, 8 rRNA; 18 duplicated in IRs yield 150 total copies). Twenty-three genes harbour introns; <i>clpP</i> and <i>ycf3</i> have two. Overall GC content is 37.73%, elevated in IRs (43.12%). Codon usage shows strong A/U bias at the third position; 172 SSRs and 39 long repeats are detected. IR-SC boundaries exhibit the greatest inter-specific variation, notably in <i>ycf1</i> and <i>ndhF</i>. <b>Conclusions</b>: The complete plastome robustly supports <i>S. 
hookeriana</i> and <i>Stebbinsia umbrella</i> as sister species (100% bootstrap) and provides essential genomic resources for species identification, population genetics, and studies of high-altitude adaptation.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12841534/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aizhan Moldakaryzova, Dias Dautov, Saken Khaidarov, Saniya Ossikbayeva, Dilyara Kaidarova
Background: Duchenne muscular dystrophy (DMD) results from pathogenic variants in the DMD gene, one of the most significant and most mutation-prone genes in the human genome. Although global mutation registries are well developed, genetic data from Central Asian populations remain extremely limited, leaving essential gaps in regional epidemiology and in the understanding of genotype-phenotype patterns. Methods: We conducted a retrospective analysis of patients with genetically confirmed dystrophinopathy in Kazakhstan. Variants were identified using multiplex ligation-dependent probe amplification (MLPA) for exon-level copy number alterations and next-generation sequencing (NGS) with Sanger confirmation for sequence-level changes. All variants were classified under ACMG guidelines. Statistical modelling incorporated mutation-class grouping, exon-hotspot mapping, reading-frame status, CPK stratification, chi-squared association testing, Spearman correlations, Kaplan-Meier ambulation survival curves, and multivariable logistic and Cox regression. Results: Multi-exon deletions were the predominant mutation class, with a marked concentration within the canonical hotspot spanning exons 44-55. Recurrent deletions affecting exons 46-50 and 45-50 appeared in several unrelated patients. NGS confirmed severe protein-truncating variants, including p.Lys1049* and p.Ser861Ilefs*7. Phenotypic severity followed a consistent hierarchy: hotspot-associated deletions and early truncating variants showed the earliest loss of ambulation, whereas splice-site variants and duplications demonstrated the mildest courses. CPK levels correlated with the extent of genomic involvement, though extreme elevations did not consistently predict early functional decline. Regression models identified hotspot localization and out-of-frame effect as independent predictors of ambulation loss. 
Conclusions: This study provides the first statistically modelled characterisation of DMD gene mutations in Kazakhstan. While the mutational landscape largely mirrors global patterns, notable variability in clinical severity suggests the presence of population-specific modifiers. Integrating comprehensive molecular diagnostics with statistical-genetics approaches enhances prognostic accuracy and supports the development of mutation-targeted therapeutic strategies in Central Asia.
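The Kaplan-Meier ambulation survival curves mentioned in the Methods rest on the product-limit estimator, sketched below in plain Python. It assumes the standard definition: at each distinct event time, survival is multiplied by one minus the number of events divided by the number still at risk, with censored patients counted as at risk up to and including their censoring time. The function name and the ages are hypothetical and are not cohort data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time per patient (e.g. age in years).
    events: 1 if the event (loss of ambulation) was observed at that
            time, 0 if the patient was censored (still ambulant).
    Returns a list of (time, survival) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical ages (years) and event flags for five patients.
curve = kaplan_meier(times=[8, 9, 9, 10, 12], events=[1, 1, 0, 1, 0])
print(curve)
```

Comparing such curves between mutation classes, or adjusting for covariates as in the Cox regression the abstract mentions, requires further modelling that this sketch does not cover.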
{"title":"Statistical Genetics of <i>DMD</i> Gene Mutations in a Kazakhstan Cohort: MLPA/NGS Variant Validation and Genotype-Phenotype Modelling.","authors":"Aizhan Moldakaryzova, Dias Dautov, Saken Khaidarov, Saniya Ossikbayeva, Dilyara Kaidarova","doi":"10.3390/genes17010020","DOIUrl":"10.3390/genes17010020","url":null,"abstract":"<p><p><b>Background</b>: Duchenne muscular dystrophy (DMD) results from pathogenic variants in the <i>DMD</i> gene, one of the most significant and most mutation-prone genes in the human genome. Although global mutation registries are well developed, genetic data from Central Asian populations remain extremely limited, leaving essential gaps in regional epidemiology and in the understanding of genotype-phenotype patterns. <b>Methods</b>: We conducted a retrospective analysis of patients with genetically confirmed dystrophinopathy in Kazakhstan. Variants were identified using multiplex ligation-dependent probe amplification (MLPA) for exon-level copy number alterations and next-generation sequencing (NGS) with Sanger confirmation for sequence-level changes. All variants were classified under ACMG guidelines. Statistical modelling incorporated mutation-class grouping, exon-hotspot mapping, reading-frame status, CPK stratification, chi-squared association testing, Spearman correlations, Kaplan-Meier ambulation survival curves, and multivariable logistic and Cox regression. <b>Results</b>: multi-exon deletions were the predominant mutation class, with a marked concentration within the canonical hotspot spanning exons 44-55. Recurrent deletions affecting exons 46-50 and 45-50 appeared in several unrelated patients. NGS confirmed severe protein-truncating variants, including p. Lys1049* and p. Ser861Ilefs*7. Phenotypic severity followed a consistent hierarchy: hotspot-associated deletions and early truncating variants showed the earliest loss of ambulation, whereas splice-site variants and duplications demonstrated the mildest courses. 
CPK levels correlated with the extent of genomic involvement, though extreme elevations did not consistently predict early functional decline. Regression models identified hotspot localization and out-of-frame effect as independent predictors of ambulation loss. <b>Conclusions</b>: This study provides the first statistically modelled characterisation of <i>DMD</i> gene mutations in Kazakhstan. While the mutational landscape largely mirrors global patterns, notable variability in clinical severity suggests the presence of population-specific modifiers. Integrating comprehensive molecular diagnostics with statistical-genetics approaches enhances prognostic accuracy and supports the development of mutation-targeted therapeutic strategies in Central Asia.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12841199/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yiting Yang, Siyu Chen, Ziling Hao, Taizeng Zhou, Songquan Guan, Ya Tan, Yan Wang, Xiaofeng Zhou, Lei Chen, Ye Zhao, Linyuan Shen, Li Zhu, Mailin Gan
Background: Testicular development and spermatogenesis are intricate biological processes controlled by a coordinated transcriptional network. However, comprehensive characterization of full-length transcripts and non-coding RNAs (ncRNAs) during porcine testicular sexual maturation remains limited. Methods: This study systematically profiled the transcriptional landscape of pig testes prior to (pre-sexual maturity, PSM) and following (post-sexual maturity, SM) sexual maturity using Oxford Nanopore Technologies (ONT) long-read sequencing. Results: There were 11,060 differentially expressed mRNAs (DEGs), 15,338 differentially expressed transcripts (DETs), 688 differentially expressed lncRNAs (DELs), and 19 differentially expressed circRNAs (DEcircRNAs) between PSM and SM groups among the 9941 mRNAs, 15,339 transcripts, 4136 lncRNAs (58.58% being LincRNAs). These differential RNAs converged on 133 shared GO terms (e.g., spermatogenesis, male gamete generation) and 58 common KEGG pathways (e.g., metabolic pathways, Wnt/MAPK signaling), according to functional enrichment and combined analysis. Core genes (e.g., PRM1, ODF2, GSTM3) demonstrated synergistic expression across gene, transcript, lncRNA-cistarget, and circRNA levels. Furthermore, DELs were associated with steroid biosynthesis and N-glycan biosynthesis, whereas DEcircRNAs, which were mostly upregulated after puberty, were thought to control genes linked to spermatogenesis. Conclusions: This research sheds light on the dynamic transcriptional reprogramming that occurs during the maturation of pig testicles, advances our knowledge of coding and ncRNA regulatory networks in male mammals, and offers useful molecular markers for enhancing pig reproductive efficiency.
{"title":"Nanopore Sequencing Technology Reveals the Transcriptional Expression Characteristics of Male Pig's Testes Before and After Sexual Maturity.","authors":"Yiting Yang, Siyu Chen, Ziling Hao, Taizeng Zhou, Songquan Guan, Ya Tan, Yan Wang, Xiaofeng Zhou, Lei Chen, Ye Zhao, Linyuan Shen, Li Zhu, Mailin Gan","doi":"10.3390/genes17010021","DOIUrl":"10.3390/genes17010021","url":null,"abstract":"<p><p><b>Background</b>: Testicular development and spermatogenesis are intricate biological processes controlled by a coordinated transcriptional network. However, comprehensive characterization of full-length transcripts and non-coding RNAs (ncRNAs) during porcine testicular sexual maturation remains limited. <b>Methods</b>: This study systematically profiled the transcriptional landscape of pig testes prior to (pre-sexual maturity, PSM) and following (post-sexual maturity, SM) sexual maturity using Oxford Nanopore Technologies (ONT) long-read sequencing. <b>Results</b>: There were 11,060 differentially expressed mRNAs (DEGs), 15,338 differentially expressed transcripts (DETs), 688 differentially expressed lncRNAs (DELs), and 19 differentially expressed circRNAs (DEcircRNAs) between PSM and SM groups among the 9941 mRNAs, 15,339 transcripts, 4136 lncRNAs (58.58% being LincRNAs). These differential RNAs converged on 133 shared GO terms (e.g., spermatogenesis, male gamete generation) and 58 common KEGG pathways (e.g., metabolic pathways, Wnt/MAPK signaling), according to functional enrichment and combined analysis. Core genes (e.g., <i>PRM1</i>, <i>ODF2</i>, <i>GSTM3</i>) demonstrated synergistic expression across gene, transcript, lncRNA-cistarget, and circRNA levels. Furthermore, DELs were associated with steroid biosynthesis and N-glycan biosynthesis, whereas DEcircRNAs, which were mostly upregulated after puberty, were thought to control genes linked to spermatogenesis. 
<b>Conclusions</b>: This research sheds light on the dynamic transcriptional reprogramming that occurs during the maturation of pig testicles, advances our knowledge of coding and ncRNA regulatory networks in male mammals, and offers useful molecular markers for enhancing pig reproductive efficiency.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12840806/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Seval Akay, Taha Resid Ozdemir, Ozge Ozer Kaya, Mustafa Degirmenci, Olcun Umit Unal
Background: Germline alterations in DNA damage repair (DDR) genes represent a clinically important subset of prostate cancer (PCa), but real-world data from Middle Eastern and Turkish populations remain limited. We evaluated the prevalence and clinicopathologic associations of germline DDR variants in a single-center Turkish cohort.
Methods: We retrospectively analyzed 122 men with histologically confirmed PCa who underwent germline multigene panel testing. Variants were classified according to ACMG/ClinVar criteria. Patients were grouped as pathogenic/likely pathogenic (P/LP), variants of uncertain significance (VUS), or variant-negative. Patients were also grouped as variant-positive (P/LP or VUS/uncategorized) or clinically actionable variant-negative (benign/likely benign or no variant detected). Group comparisons used t-tests, chi-square tests, or Fisher's exact tests as appropriate.
Results: The median age at diagnosis was 65.2 years (mean 64.6 ± 8.78). Overall, 37 patients (30.3%) carried at least one germline variant, including 12 (9.8%) with P/LP alterations and 24 (19.7%) with VUS; one patient (0.8%) harbored an uncategorized variant. The most frequently affected genes were CHEK2 (n = 8), BRCA1 (n = 6), BRCA2 (n = 6), ATM (n = 5), and APC (n = 4). Variant-positive status increased from 10.8% in ISUP 1-2 to 21.6% in ISUP 3 and 76.0% in ISUP 4-5, although this trend was not statistically significant (p = 0.391). Mean age at diagnosis and the prevalence of metastatic disease did not differ between variant-positive and clinically actionable variant-negative patients (64.2 vs. 65.7 years, p = 0.390; 66.7% vs. 64.6%, p = 0.842). Truncating DDR variants (RAD50, BRCA2, MSH3, NBN, CHEK2, ATM) occurred predominantly in ISUP 4-5 tumors.
Conclusions: Germline DDR alterations, most notably in BRCA2, CHEK2, and ATM, were present in a substantial subset of Turkish men with PCa and showed a non-significant trend toward clustering in higher-grade disease. The high prevalence of VUS reflects limited genomic annotation in under-represented populations and underscores the need for longitudinal reinterpretation. These data support the clinical value of incorporating germline DDR testing into risk assessment and familial counseling, while larger cohorts integrating somatic profiling are needed to refine genotype-phenotype associations.
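The prevalence figures quoted in the Results of this abstract follow directly from the reported carrier counts and the cohort size of 122. The short Python sketch below recomputes them as a consistency check; the dictionary labels and the `prevalence` helper are illustrative names chosen here, not part of the study, and only counts stated in the abstract are used.

```python
# Counts as reported in the abstract's Results paragraph.
COHORT_SIZE = 122
carriers = {
    "any germline variant": 37,
    "P/LP": 12,
    "VUS": 24,
    "uncategorized": 1,
}


def prevalence(count, total=COHORT_SIZE):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Recompute each reported percentage from its count.
percentages = {label: prevalence(n) for label, n in carriers.items()}

# The three subgroups should add up to the overall number of carriers.
subgroup_total = carriers["P/LP"] + carriers["VUS"] + carriers["uncategorized"]

for label, pct in percentages.items():
    print(f"{label}: {carriers[label]}/{COHORT_SIZE} = {pct}%")
print("subgroups sum to all carriers:", subgroup_total == carriers["any germline variant"])
```

The recomputed values match the abstract (30.3%, 9.8%, 19.7%, and 0.8%), and the three subgroups account for all 37 carriers.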
{"title":"Prevalence and Clinical Associations of Germline DDR Variants in Prostate Cancer: Real-World Evidence from a 122-Patient Turkish Cohort.","authors":"Seval Akay, Taha Resid Ozdemir, Ozge Ozer Kaya, Mustafa Degirmenci, Olcun Umit Unal","doi":"10.3390/genes17010023","DOIUrl":"10.3390/genes17010023","url":null,"abstract":"<p><strong>Background: </strong>Germline alterations in DNA damage repair (DDR) genes represent a clinically important subset of prostate cancer (PCa), but real-world data from Middle Eastern and Turkish populations remain limited. We evaluated the prevalence and clinicopathologic associations of germline DDR variants in a single-center Turkish cohort.</p><p><strong>Methods: </strong>We retrospectively analyzed 122 men with histologically confirmed PCa who underwent germline multigene panel testing. Variants were classified according to ACMG/ClinVar criteria. Patients were grouped as pathogenic/likely pathogenic (P/LP), variants of uncertain significance (VUS), or variant-negative. Patients were grouped as variant-positive (P/LP or VUS/uncategorized) or clinically actionable variant-negative (benign/likely benign or no variant detected). Group comparisons used <i>t</i>-tests, chi-square or Fisher's exact tests as appropriate.</p><p><strong>Results: </strong>The median age at diagnosis was 65.2 years (mean 64.6 ± 8.78). Overall, 37 patients (30.3%) carried at least one germline variant, including 12 (9.8%) with P/LP alterations and 24 (19.7%) with VUS; one patient (0.8%) harbored an uncategorized variant. The most frequently affected genes were CHEK2 (<i>n</i> = 8), BRCA1 (<i>n</i> = 6), BRCA2 (<i>n</i> = 6), ATM (<i>n</i> = 5), and APC (<i>n</i> = 4). Variant-positive status increased from 10.8% in ISUP 1-2 to 21.6% in ISUP 3 and 76.0% in ISUP 4-5, although this trend was not statistically significant (<i>p</i> = 0.391). 
Mean age at diagnosis and the prevalence of metastatic disease did not differ between variant-positive and clinically actionable variant-negative patients (64.2 vs. 65.7 years, <i>p</i> = 0.390; 66.7% vs. 64.6%, <i>p</i> = 0.842). Truncating DDR variants (RAD50, BRCA2, MSH3, NBN, CHEK2, ATM) occurred predominantly in ISUP 4-5 tumors.</p><p><strong>Conclusions: </strong>Germline DDR alterations-most notably in BRCA2, CHEK2, and ATM-were present in a substantial subset of Turkish men with PCa and showed a non-significant trend toward clustering in higher-grade disease. The high prevalence of VUS reflects limited genomic annotation in under-represented populations and underscores the need for longitudinal reinterpretation. These data support the clinical value of incorporating germline DDR testing into risk assessment and familial counseling, while larger cohorts integrating somatic profiling are needed to refine genotype-phenotype associations.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12841403/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zheng Weng, Fan Wang, Xin Wei, Lianjia Zhao, Wei Wang, Jianfeng Lei
Background: Salt stress is a primary abiotic constraint on cotton growth, significantly impairing yield and fiber quality.
Methods: To elucidate the regulatory mechanisms underlying salt stress responses in Gossypium hirsutum, we performed transcriptomic and metabolomic profiling at multiple time points following salt treatment.
Results: We identified 33,975 differentially expressed genes (DEGs), with significant enrichment in pathways related to plant hormone signal transduction, amino acid metabolism, and starch and sucrose metabolism. K-means clustering grouped the DEGs into six expression modules corresponding to distinct response stages. Additionally, UPLC-MS analysis identified 6292 metabolites, spanning lipids, carbohydrates, and amino acids, and revealed substantial metabolic reprogramming with increasing stress duration. An integrated multiomics analysis highlighted the ABC transporter and starch and sucrose metabolism pathways as key regulatory modules for salt tolerance and identified critical genes within them.
Conclusions: Collectively, these findings provide a comprehensive view of the transcriptional and metabolic dynamics of G. hirsutum under salt stress, offering valuable insights for understanding the molecular mechanisms of salt tolerance.
{"title":"Multiomics Profiling Unveils Key Genes and Metabolites Involved in the Salt Tolerance of <i>Gossypium hirsutum</i>.","authors":"Zheng Weng, Fan Wang, Xin Wei, Lianjia Zhao, Wei Wang, Jianfeng Lei","doi":"10.3390/genes17010022","DOIUrl":"10.3390/genes17010022","url":null,"abstract":"<p><strong>Background: </strong>Salt stress is a primary abiotic constraint on cotton growth, significantly impairing yield and fiber quality.</p><p><strong>Methods: </strong>To elucidate the regulatory mechanisms underlying salt stress responses in <i>Gossypium hirsutum</i>, we performed transcriptomic and metabolomic profiling at multiple time points following salt treatment.</p><p><strong>Results: </strong>We identified 33,975 differentially expressed genes (DEGs), with significant enrichment in pathways related to plant hormone signal transduction, amino acid metabolism, and starch and sucrose metabolism. K-means clustering grouped the DEGs into six expression modules corresponding to distinct response stages. Additionally, UPLC-MS analysis identified 6292 metabolites-spanning lipids, carbohydrates, and amino acids-and revealed substantial metabolic reprogramming with increasing stress duration. An integrated multiomics analysis highlighted the ABC transporter and starch and sucrose metabolism pathways as key regulatory modules for salt tolerance and identified critical genes within them.</p><p><strong>Conclusions: </strong>Collectively, these findings provide a comprehensive view of the transcriptional and metabolic dynamics of <i>G. 
hirsutum</i> under salt stress, offering valuable insights for understanding the molecular mechanisms of salt tolerance.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12840902/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ying Li, Wentao Guo, Hongliang Ji, Weilin Cao, Gaoqing Li, Ruirui Xu, Liming Gan
Background: The ATP-binding cassette (ABC) G subfamily, a key member of the ABC protein family, mediates plant stress responses by transporting metabolites across membranes, but its mechanism of action in tomato (Solanum lycopersicum L.) remains poorly understood.
Methods: We systematically analyzed the evolutionary relationships, structural characteristics, stress-responsive expression patterns, and functional roles in response to saline-alkali stress of the SlABCG gene family in tomato, using a combination of approaches including phylogenetic analysis (MEGA), gene structure and motif analysis (GSDS, MEME), cis-acting element prediction, homology analysis, transcriptome analysis, protein-protein interaction prediction, and qRT-PCR validation.
Results: We identified a total of 41 SlABCG genes from the tomato genome. These genes, together with 43 ABCG genes from Arabidopsis thaliana, were clustered into five distinct clades. There are 35 collinear gene pairs between the SlABCG gene family in tomato and the ABCG gene family in Arabidopsis, while 39 collinear gene pairs exist among ABCG genes within the tomato genome itself. The promoter regions of SlABCG genes contain cis-acting elements associated with responses to salicylic acid, low temperature, and gibberellin stresses. Transcriptome sequencing revealed that six SlABCG genes responded to saline-alkali stress. Gene regulatory network prediction revealed that multiple genes related to saline-alkali stress were regulated. Expression profile analysis of the 25 upregulated genes revealed that all of them were significantly upregulated during the saline-alkali stress treatment.
Conclusions: In summary, our results provide deep insights into the characteristics of the SlABCG subfamily, facilitate the design of effective analysis strategies, and offer data support for exploring the roles of ABCG transporters under different stress conditions.
{"title":"Genome-Wide Characterization of <i>SlABCG</i> Genes in Tomato Reveals Their Role in Saline-Alkali Tolerance.","authors":"Ying Li, Wentao Guo, Hongliang Ji, Weilin Cao, Gaoqing Li, Ruirui Xu, Liming Gan","doi":"10.3390/genes17010019","DOIUrl":"10.3390/genes17010019","url":null,"abstract":"<p><strong>Background: </strong>The ATP-binding cassette (ABC) G subfamily, a key member of the ABC protein family, mediates plant stress responses by transporting metabolites across membranes, but its mechanism of action in tomato (<i>Solanum lycopersicum</i> L.) remains poorly understood.</p><p><strong>Methods: </strong>We systematically analyzed the evolutionary relationships, structural characteristics, stress-responsive expression patterns, and functional roles in response to saline-alkali stress of the <i>SlABCG</i> gene family in tomato, using a combination of approaches including phylogenetic analysis (MEGA), gene structure and motif analysis (GSDS, MEME), cis-acting element prediction, homology analysis, transcriptome analysis, protein-protein interaction prediction, and qRT-PCR validation.</p><p><strong>Results: </strong>We identified a total of 41 <i>SlABCG</i> genes from the tomato genome. These genes, together with 43 <i>ABCG</i> genes from <i>Arabidopsis thaliana</i>, were clustered into five distinct clades. There are 35 collinear gene pairs between the <i>SlABCG</i> gene family in tomato and the <i>ABCG</i> gene family in <i>Arabidopsis</i>, while 39 collinear gene pairs exist among <i>ABCG</i> genes within the tomato genome itself. The promoter regions of <i>SlABCG</i> genes contain cis-acting elements associated with responses to salicylic acid, low temperature, and gibberellin stresses. Transcriptome sequencing revealed that six <i>SlABCG</i> genes responded to saline-alkali stress. Gene regulatory network prediction revealed that multiple genes related to saline-alkali stress were regulated. 
Expression profile analysis of the 25 upregulated genes revealed that all of them were significantly upregulated during the saline-alkali stress treatment.</p><p><strong>Conclusions: </strong>In summary, our results provide deep insights into the characteristics of the SlABCG subfamily, facilitate the design of effective analysis strategies, and offer data support for exploring the roles of ABCG transporters under different stress conditions.</p>","PeriodicalId":12688,"journal":{"name":"Genes","volume":"17 1","pages":""},"PeriodicalIF":2.8,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12841099/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146062299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}