M Zhao, T E Tyugashev, A T Davletgildeeva, N A Kuznetsov
The ABH2 enzyme belongs to the AlkB-like family of Fe(II)/α-ketoglutarate-dependent dioxygenases. Various non-heme dioxygenases act on a wide range of substrates and have a complex catalytic mechanism involving α-ketoglutarate and an Fe(II) ion as a cofactor. Representatives of the AlkB family catalyze the direct oxidation of alkyl substituents in the nitrogenous bases of DNA and RNA, providing protection against the mutagenic effects of endogenous and exogenous alkylating agents, and also participate in the regulation of the methylation level of some RNAs. DNA dioxygenase ABH2, localized predominantly in the cell nucleus, is specific for double-stranded DNA substrates and, unlike most other human AlkB-like enzymes, has a fairly broad spectrum of substrate specificity, oxidizing alkyl groups of such modified nitrogenous bases as, for example, N1-methyladenosine, N3-methylcytidine, 1,N6-ethenoadenosine and 3,N4-ethenocytidine. To analyze the mechanism underlying the enzyme's substrate specificity and to clarify the functional role of key active-site amino acid residues, we performed molecular dynamics simulations of complexes of the wild-type ABH2 enzyme and its mutant forms containing amino acid substitutions V99A, F124A and S125A with two types of DNA substrates carrying methylated bases N1-methyladenine and N3-methylcytosine, respectively. It was found that the V99A substitution leads to an increase in the mobility of protein loops L1 and L2 involved in binding the DNA substrate and changes the distribution of π-π contacts between the side chain of residue F102 and nitrogenous bases located near the damaged nucleotide. The F124A substitution leads to the loss of π-π stacking with the damaged base, which in turn destabilizes the architecture of the active site, disrupts the interaction with the iron ion and prevents optimal catalytic positioning of α-ketoglutarate in the active site. 
The S125A substitution leads to the loss of direct interaction of the L2 loop with the 5'-phosphate group of the damaged nucleotide, weakening the binding of the enzyme to the DNA substrate. Thus, the obtained data revealed the functional role of three amino acid residues of the active site and contributed to the understanding of the structural-functional relationships in the recognition of a damaged nucleotide and the formation of a catalytic complex by the human ABH2 enzyme.
{"title":"Molecular dynamic analysis of the functional role of amino acid residues V99, F124 and S125 of human DNA dioxygenase ABH2.","authors":"M Zhao, T E Tyugashev, A T Davletgildeeva, N A Kuznetsov","doi":"10.18699/vjgb-25-111","DOIUrl":"https://doi.org/10.18699/vjgb-25-111","url":null,"abstract":"<p><p>The ABH2 enzyme belongs to the AlkB-like family of Fe(II)/α-ketoglutarate-dependent dioxygenases. Various non-heme dioxygenases act on a wide range of substrates and have a complex catalytic mechanism involving α-ketoglutarate and an Fe(II) ion as a cofactor. Representatives of the AlkB family catalyze the direct oxidation of alkyl substituents in the nitrogenous bases of DNA and RNA, providing protection against the mutagenic effects of endogenous and exogenous alkylating agents, and also participate in the regulation of the methylation level of some RNAs. DNA dioxygenase ABH2, localized predominantly in the cell nucleus, is specific for double-stranded DNA substrates and, unlike most other human AlkB-like enzymes, has a fairly broad spectrum of substrate specificity, oxidizing alkyl groups of such modified nitrogenous bases as, for example, N 1-methyladenosine, N 3-methylcytidine, 1,N 6-ethenoadenosine and 3,N 4-ethenocytidine. To analyze the mechanism underlying the enzyme's substrate specificity and to clarify the functional role of key active-site amino acid residues, we performed molecular dynamics simulations of complexes of the wild-type ABH2 enzyme and its mutant forms containing amino acid substitutions V99A, F124A and S125A with two types of DNA substrates carrying methylated bases N 1-methyladenine and N 3-methylcytosine, respectively. It was found that the V99A substitution leads to an increase in the mobility of protein loops L1 and L2 involved in binding the DNA substrate and changes the distribution of π-π contacts between the side chain of residue F102 and nitrogenous bases located near the damaged nucleotide. 
The F124A substitution leads to the loss of π-π stacking with the damaged base, which in turn destabilizes the architecture of the active site, disrupts the interaction with the iron ion and prevents optimal catalytic positioning of α-ketoglutarate in the active site. The S125A substitution leads to the loss of direct interaction of the L2 loop with the 5'-phosphate group of the damaged nucleotide, weakening the binding of the enzyme to the DNA substrate. Thus, the obtained data revealed the functional role of three amino acid residues of the active site and contributed to the understanding of the structural-functional relationships in the recognition of a damaged nucleotide and the formation of a catalytic complex by the human ABH2 enzyme.</p>","PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 7","pages":"1062-1072"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12795828/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145971271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E A Antropova, I V Yatsyk, P S Demenkov, T V Ivanisenko, V A Ivanisenko
Macrophages are immune system cells that perform various, often opposing, functions in the organism depending on the incoming microenvironment signals. This is possible due to the plasticity of macrophages, which allows them to radically alter their phenotypic characteristics and gene expression profiles, as well as return to their original, non-activated state. Depending on the inductors acting on the cell, macrophages are activated into various functional states. There are five main phenotypes of activated macrophages: M1, M2a, M2b, M2c, and M2d. Although the amount of genome-wide transcriptomic and proteomic data showing differences between major macrophage phenotypes and non-activated macrophages (M0) is rapidly growing, questions regarding the mechanisms regulating gene and protein expression profiles in macrophages of different phenotypes still remain. We compiled lists of proteins associated with the macrophage phenotypes M1, M2a, M2b, M2c, and M2d (phenotype-associated proteins) and analyzed the data on potential mediators of macrophage polarization. Furthermore, using the computational system ANDSystem, we conducted a search and analysis of the relationships between potential regulatory proteins and the genes encoding the proteins associated with the M2 group phenotypes, obtaining estimates of the statistical significance of these relationships. The results indicate that the differences in the M2a, M2b, M2c, and M2d macrophage phenotypes may be attributed to the regulatory effects of the proteins JUN, IL8, NFAC2, CCND1, and YAP1. The expression levels of these proteins vary among the M2 group phenotypes, which in turn leads to different levels of gene expression associated with specific phenotypes.
{"title":"Identification of proteins regulating phenotype-associated genes of M2 macrophages: a bioinformatic analysis.","authors":"E A Antropova, I V Yatsyk, P S Demenkov, T V Ivanisenko, V A Ivanisenko","doi":"10.18699/vjgb-25-104","DOIUrl":"https://doi.org/10.18699/vjgb-25-104","url":null,"abstract":"<p><p>Macrophages are immune system cells that perform various, often opposing, functions in the organism depending on the incoming microenvironment signals. This is possible due to the plasticity of macrophages, which allows them to radically alter their phenotypic characteristics and gene expression profiles, as well as return to their original, non-activated state. Depending on the inductors acting on the cell, macrophages are activated into various functional states. There are five main phenotypes of activated macrophages: M1, M2a, M2b, M2c, and M2d. Although the amount of genome-wide transcriptomic and proteomic data showing differences between major macrophage phenotypes and non-activated macrophages (M0) is rapidly growing, questions regarding the mechanisms regulating gene and protein expression profiles in macrophages of different phenotypes still remain. We compiled lists of proteins associated with the macrophage phenotypes M1, M2a, M2b, M2c, and M2d (phenotype-associated proteins) and analyzed the data on potential mediators of macrophage polarization. Furthermore, using the computational system ANDSystem, we conducted a search and analysis of the relationships between potential regulatory proteins and the genes encoding the proteins associated with the M2 group phenotypes, obtaining estimates of the statistical significance of these relationships. The results indicate that the differences in the M2a, M2b, M2c, and M2d macrophage phenotypes may be attributed to the regulatory effects of the proteins JUN, IL8, NFAC2, CCND1, and YAP1. 
The expression levels of these proteins vary among the M2 group phenotypes, which in turn leads to different levels of gene expression associated with specific phenotypes.</p>","PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 7","pages":"990-999"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12800646/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, the rapid growth of sequencing data has exacerbated the problem of functional annotation of protein sequences, as traditional homology-based methods face limitations when working with distant homologs, making it difficult to accurately determine protein functions. This paper introduces the OrthoML2GO method for protein function prediction, which integrates homology searches using the USEARCH algorithm, orthogroup analysis based on OrthoDB version 12.0, and a machine learning algorithm (gradient boosting). A key feature of our approach is the use of orthogroup information to account for the evolutionary and functional similarity of proteins and the application of machine learning to refine the assigned GO terms for the target sequence. To select the optimal algorithm for protein annotation, the following approaches were applied sequentially: the k-nearest neighbors (KNN) method; a method based on the annotation of the orthogroup most represented in the k-nearest homologs (OG); a method of verifying the GO terms identified in the previous stage using machine learning algorithms. A comparison of the prediction accuracy of GO terms using the OrthoML2GO method with the Blast2GO and PANNZER2 annotation programs was performed on sequence samples from both individual organisms (humans, Arabidopsis) and a combined sample represented by different taxa. Our results demonstrate that the proposed method is comparable to, and by some evaluation metrics outperforms, these existing methods in terms of the quality of protein function prediction, especially on large and heterogeneous samples of organisms. The greatest performance improvement is achieved by combining information about the closest homologs and orthogroups with verification of terms using machine learning methods. 
Our approach demonstrates high performance for large-scale automatic protein annotation, and prospects for further development include optimizing machine learning model parameters for specific biological tasks and integrating additional sources of structural and functional information, which will further improve the method's accuracy and versatility. In addition, the introduction of new bioinformatics tools and the expansion of the annotated protein database will contribute to the further improvement of the proposed approach.
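The KNN and OG stages described in this abstract can be illustrated with a minimal sketch. It assumes that homology hits have already been obtained (for example, from a USEARCH run) and are available as (protein ID, similarity score) pairs. The function names, data structures, score weighting and the 0.5 vote threshold are hypothetical choices made for illustration and are not taken from the OrthoML2GO implementation; the machine learning verification stage is omitted.

```python
from collections import Counter, defaultdict


def knn_go_terms(hits, go_annotations, k=5, min_share=0.5):
    """Score-weighted vote over the GO terms of the k best hits.

    hits: list of (protein_id, score) pairs in any order (hypothetical format).
    go_annotations: dict mapping protein_id -> set of GO terms.
    Returns the terms whose share of the total score weight is >= min_share.
    """
    top = sorted(hits, key=lambda h: h[1], reverse=True)[:k]
    total = sum(score for _, score in top)
    if total <= 0:
        return set()
    weight = defaultdict(float)
    for pid, score in top:
        for term in go_annotations.get(pid, ()):
            weight[term] += score
    return {term for term, w in weight.items() if w / total >= min_share}


def og_go_terms(hits, orthogroup_of, og_annotations, k=5):
    """Return the GO terms of the orthogroup most represented among the k best hits."""
    top = sorted(hits, key=lambda h: h[1], reverse=True)[:k]
    counts = Counter(orthogroup_of[pid] for pid, _ in top if pid in orthogroup_of)
    if not counts:
        return set()
    best_og, _ = counts.most_common(1)[0]
    return set(og_annotations.get(best_og, ()))
```

In this sketch the two functions are independent, so their outputs can be compared or intersected before any downstream verification step.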
{"title":"OrthoML2GO: homology-based protein function prediction using orthogroups and machine learning.","authors":"E V Malyugin, D A Afonnikov","doi":"10.18699/vjgb-25-119","DOIUrl":"https://doi.org/10.18699/vjgb-25-119","url":null,"abstract":"<p><p>In recent years, the rapid growth of sequencing data has exacerbated the problem of functional annotation of protein sequences, as traditional homology-based methods face limitations when working with distant homologs, making it difficult to accurately determine protein functions. This paper introduces the OrthoML2GO method for protein function prediction, which integrates homology searches using the USEARCH algorithm, orthogroup analysis based on OrthoDB version 12.0, and a machine learning algorithm (gradient boosting). A key feature of our approach is the use of orthogroup information to account for the evolutionary and functional similarity of proteins and the application of machine learning to refine the assigned GO terms for the target sequence. To select the optimal algorithm for protein annotation, the following approaches were applied sequentially: the k-nearest neighbors (KNN) method; a method based on the annotation of the orthogroup most represented in the k-nearest homologs (OG); a method of verifying the GO terms identified in the previous stage using machine learning algorithms. A comparison of the prediction accuracy of GO terms using the OrthoML2GO method with the Blast2GO and PANNZER2 annotation programs was performed on sequence samples from both individual organisms (humans, Arabidopsis) and a combined sample represented by different taxa. Our results demonstrate that the proposed method is comparable to, and by some evaluation metrics outperforms, these existing methods in terms of the quality of protein function prediction, especially on large and heterogeneous samples of organisms. 
The greatest performance improvement is achieved by combining information about the closest homologs and orthogroups with verification of terms using machine learning methods. Our approach demonstrates high performance for large-scale automatic protein annotation, and prospects for further development include optimizing machine learning model parameters for specific biological tasks and integrating additional sources of structural and functional information, which will further improve the method's accuracy and versatility. In addition, the introduction of new bioinformatics tools and the expansion of the annotated protein database will contribute to the further improvement of the proposed approach.</p>","PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 7","pages":"1145-1154"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12799360/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
One of the main goals of modern evolutionary biology is to understand the mechanisms that lead to the initial differentiation (primary divergence) of populations into groups with genetic traits. This divergence requires reproductive isolation, which prevents or hinders contact and the exchange of genetic material between populations. This study explores the potential for isolation based not on obvious geographical barriers, population distance, or ecological specialization, but rather on hereditary mechanisms, such as gene drift and flow and selection against heterozygous individuals. To this end, we propose and investigate a dynamic discrete-time model that describes the dynamics of frequencies and numbers in a system of limited populations coupled by migrations. We consider a panmictic population with Mendelian inheritance rules, one-locus selection, and density-dependent factors limiting population growth. Individuals freely mate and randomly move around a one-dimensional ring-shaped habitat. The model was verified using data from an experiment on the box population system of Drosophila melanogaster performed by Yu.P. Altukhov et al. With rather simple assumptions, the model explains some mechanisms for the emergence and preservation of significant genetic differences between subpopulations (primary genetic divergence), accompanied by heterogeneity in allele frequencies and abundances within a homogeneous area. In this scenario, several large groups of genetically homogeneous subpopulations form and independently develop. Hybridization occurs at contact sites, and polymorphism is maintained through migration from genetically homogeneous nearby sites. It was found that only disruptive selection, directed against heterozygous individuals, can sustainably maintain such a spatial distribution. Under directional selection, divergence may occur for a short time as part of the transitional evolutionary process towards the best-adapted genotype. 
Because of the reduced adaptability of heterozygous (hybrid) individuals and low growth rates in these sites (hybrid zones), gene flow between adjacent sites with opposite genotypes (phenotypes) is significantly impeded. As a result, the hybrid zones can become effective geographical barriers that prevent the genetic flow between coupled subpopulations.
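A model of the kind described in this abstract can be sketched as follows. The abstract does not give the equations, so everything here is an assumption: one diallelic locus with genotype fitnesses (w_AA, w_Aa, w_aa) interpreted as reproductive potentials, Ricker-type density limitation with scale K, and a fraction m of each site's individuals moving in equal parts to the two neighboring sites of the ring. All parameter values are illustrative.

```python
import math


def step(p, n, w, m, K):
    """One generation: selection and Ricker-type growth at each site, then migration."""
    w_AA, w_Aa, w_aa = w
    size = len(p)
    p_sel, n_sel = [], []
    for pi, ni in zip(p, n):
        qi = 1.0 - pi
        w_mean = w_AA * pi * pi + 2 * w_Aa * pi * qi + w_aa * qi * qi
        # Allele A frequency after one-locus selection in a panmictic site
        p_sel.append(pi * (w_AA * pi + w_Aa * qi) / w_mean)
        # Abundance grows with mean fitness and is limited by density
        n_sel.append(ni * w_mean * math.exp(-ni / K))
    p_new, n_new = [], []
    for i in range(size):
        left, right = (i - 1) % size, (i + 1) % size  # ring topology
        n_i = (1 - m) * n_sel[i] + 0.5 * m * (n_sel[left] + n_sel[right])
        a_i = ((1 - m) * n_sel[i] * p_sel[i]
               + 0.5 * m * (n_sel[left] * p_sel[left] + n_sel[right] * p_sel[right]))
        n_new.append(n_i)
        p_new.append(a_i / n_i)  # frequency of A among residents plus immigrants
    return p_new, n_new


def simulate(w, generations=200, sites=10, m=0.05, K=100.0):
    """Start with allele A common in one half of the ring and rare in the other."""
    p = [0.9] * (sites // 2) + [0.1] * (sites - sites // 2)
    n = [50.0] * sites
    for _ in range(generations):
        p, n = step(p, n, w, m, K)
    return p, n
```

Under these assumptions, calling `simulate((2.0, 1.0, 2.0))` (selection against heterozygotes) keeps the two halves of the ring genetically distinct, whereas `simulate((2.0, 1.5, 1.0))` (directional selection) drives every site toward fixation of allele A.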
{"title":"Computer modeling of spatial dynamics and primary genetic divergence for a population system in a ring areal.","authors":"M P Kulakov, O L Zhdanova, E Ya Frisman","doi":"10.18699/vjgb-25-115","DOIUrl":"https://doi.org/10.18699/vjgb-25-115","url":null,"abstract":"<p><p>One of the main goals of modern evolutionary biology is to understand the mechanisms that lead to the initial differentiation (primary divergence) of populations into groups with genetic traits. This divergence requires reproductive isolation, which prevents or hinders contact and the exchange of genetic material between populations. This study explores the potential for isolation based not on obvious geographical barriers, population distance, or ecological specialization, but rather on hereditary mechanisms, such as gene drift and flow and selection against heterozygous individuals. To this end, we propose and investigate a dynamic discrete-time model that describes the dynamics of frequencies and numbers in a system of limited populations coupled by migrations. We consider a panmictic population with Mendelian inheritance rules, one-locus selection, and density-dependent factors limiting population growth. Individuals freely mate and randomly move around a one-dimensional ring-shaped habitat. The model was verified using data from an experiment on the box population system of Drosophila melanogaster performed by Yu.P. Altukhov et al. With rather simple assumptions, the model explains some mechanisms for the emergence and preservation of significant genetic differences between subpopulations (primary genetic divergence), accompanied by heterogeneity in allele frequencies and abundances within a homogeneous area. In this scenario, several large groups of genetically homogeneous subpopulations form and independently develop. Hybridization occurs at contact sites, and polymorphism is maintained through migration from genetically homogeneous nearby sites. 
It was found that only disruptive selection, directed against heterozygous individuals, can sustainably maintain such a spatial distribution. Under directional selection, divergence may occur for a short time as part of the transitional evolutionary process towards the best-adapted genotype. Because of the reduced adaptability of heterozygous (hybrid) individuals and low growth rates in these sites (hybrid zones), gene flow between adjacent sites with opposite genotypes (phenotypes) is significantly impeded. As a result, the hybrid zones can become effective geographical barriers that prevent the genetic flow between coupled subpopulations.</p>","PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 7","pages":"1109-1121"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12799358/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N M Levanova, E G Vergunov, A N Savostyanov, I V Yatsyk, V A Ivanisenko
Accumulated evidence links dysregulated cytokine signaling to the pathogenesis of autism spectrum disorder (ASD), implicating genes, proteins, and their intermolecular networks. This paper systematizes these findings using bioinformatics analysis and machine learning methods. The primary tool employed in the study was the ANDSystem cognitive platform, developed at the Institute of Cytology and Genetics, which utilizes artificial intelligence techniques for automated knowledge extraction from biomedical databases and scientific publications. Using ANDSystem, we reconstructed a gene network of cytokine-mediated regulation of autism spectrum disorder (ASD)-associated genes and proteins. The analysis identified 110 cytokines that regulate the activity, degradation, and transport of 58 proteins involved in ASD pathogenesis, as well as the expression of 91 ASD-associated genes. Gene Ontology (GO) enrichment analysis revealed statistically significant associations of these genes with biological processes related to the development and function of the central nervous system. Furthermore, topological network analysis and functional significance assessment based on association with ASD-related GO biological processes allowed us to identify 21 cytokines exerting the strongest influence on the regulatory network. Among these, eight cytokines (IL-4, TGF-β1, BMP4, VEGFA, BMP2, IL-10, IFN-γ, TNF-α) had the highest priority, ranking at the top across all employed metrics. Notably, eight of the 21 prioritized cytokines (TNF-α, IL-6, IL-4, VEGFA, IL-2, IL-1β, IFN-γ, IL-17) are known targets of drugs currently used as immunosuppressants and antitumor agents. The pivotal role of these cytokines in ASD pathogenesis provides a rationale for potentially repurposing such inhibitory drugs for the treatment of autism spectrum disorders.
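One simple form of topological prioritization is ranking regulators by the number of distinct targets they act on. The sketch below illustrates only that idea; the abstract does not specify which network metrics were used, so the out-degree criterion, the edge-list format and the function name are assumptions, and any identifiers passed in are placeholders.

```python
from collections import defaultdict


def rank_regulators(edges):
    """Rank regulators by number of distinct targets (out-degree); ties are broken by name.

    edges: iterable of (regulator, target) pairs (hypothetical format).
    Returns a list of (regulator, out_degree) sorted from most to least connected.
    """
    targets = defaultdict(set)
    for regulator, target in edges:
        targets[regulator].add(target)
    return sorted(((r, len(t)) for r, t in targets.items()),
                  key=lambda item: (-item[1], item[0]))
```

Repeated edges between the same pair are counted once because targets are stored in a set.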
{"title":"In silico reconstruction of the gene network for cytokine regulation of ASD-associated genes and proteins.","authors":"N M Levanova, E G Vergunov, A N Savostyanov, I V Yatsyk, V A Ivanisenko","doi":"10.18699/vjgb-25-105","DOIUrl":"https://doi.org/10.18699/vjgb-25-105","url":null,"abstract":"<p><p>Accumulated evidence links dysregulated cytokine signaling to the pathogenesis of autism spectrum disorder (ASD), implicating genes, proteins, and their intermolecular networks. This paper systematizes these findings using bioinformatics analysis and machine learning methods. The primary tool employed in the study was the ANDSystem cognitive platform, developed at the Institute of Cytology and Genetics, which utilizes artificial intelligence techniques for automated knowledge extraction from biomedical databases and scientific publications. Using ANDSystem, we reconstructed a gene network of cytokine-mediated regulation of autism spectrum disorder (ASD)-associated genes and proteins. The analysis identified 110 cytokines that regulate the activity, degradation, and transport of 58 proteins involved in ASD pathogenesis, as well as the expression of 91 ASD-associated genes. Gene Ontology (GO) enrichment analysis revealed statistically significant associations of these genes with biological processes related to the development and function of the central nervous system. Furthermore, topological network analysis and functional significance assessment based on association with ASD-related GO biological processes allowed us to identify 21 cytokines exerting the strongest influence on the regulatory network. Among these, eight cytokines (IL-4, TGF-β1, BMP4, VEGFA, BMP2, IL-10, IFN-γ, TNF-α) had the highest priority, ranking at the top across all employed metrics. Notably, eight of the 21 prioritized cytokines (TNF-α, IL-6, IL-4, VEGFA, IL-2, IL-1β, IFN-γ, IL-17) are known targets of drugs currently used as immunosuppressants and antitumor agents. 
The pivotal role of these cytokines in ASD pathogenesis provides a rationale for potentially repurposing such inhibitory drugs for the treatment of autism spectrum disorders.</p>","PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 7","pages":"1000-1008"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12795833/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145971308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T S Golubeva, V A Cherenko, E A Filipenko, I V Zhirnov, A A Ivanov, A V Kochetov
RNA interference (RNAi) is a powerful tool for gene silencing. It has recently been used to design promising plant protection strategies against pests such as viruses, insects, etc. This generally requires modifying the plant genome to achieve in planta synthesis of the double-stranded RNA (dsRNA), which guides the cellular RNA interference machinery to silence the genes of interest. However, given Russian legislation, the approach in which dsRNA is synthesized by the plant itself remains unavailable for crop protection. The use of exogenously produced dsRNA appears to be a promising alternative, allowing researchers to avoid genetic modification of plants, making it possible to implement potential results in agriculture. Furthermore, exogenous dsRNAs are superior to chemical pesticides (fungicides, insecticides, etc.), which are widely used to control various plant diseases. The dsRNA acts through sequence-specific nucleic acid interactions, making it extremely selective and unlikely to harm off-target organisms. Thus, it seems promising to utilize RNAi technology for agricultural plant protection. In this case, questions arise regarding how to produce the required amounts of pathogen-specific exogenous dsRNA, and which delivery method will be optimal for providing sufficient protection. This work aims to utilize exogenous dsRNA to silence the Nicotiana benthamiana phytoene desaturase gene. Phytoene desaturase is a convenient model gene in gene silencing experiments, as its knockdown results in a distinct phenotypic manifestation, namely, leaf bleaching. The dsRNA synthesis for this work was performed in vivo in Escherichia coli cells, and the chosen delivery method was root treatment through watering, both techniques being as simple and accessible as possible. It is surmised that the proposed approach could be adapted for broader use of RNAi technologies in agricultural crop protection.
{"title":"Silencing of the Nicotiana benthamiana phytoendesaturase gene by root treatment of exogenous dsRNA.","authors":"Т S Golubeva, V A Cherenko, E A Filipenko, I V Zhirnov, A A Ivanov, A V Kochetov","doi":"10.18699/vjgb-25-123","DOIUrl":"https://doi.org/10.18699/vjgb-25-123","url":null,"abstract":"<p><p>RNA interference (RNAi) is a powerful tool for gene silencing. It has recently been used to design promising plant protection strategies against pests such as viruses, insects, etc. This generally requires modifying the plant genome to achieve in planta synthesis of the double-stranded RNA (dsRNA), which guides the cellular RNA interference machinery to silence the genes of interest. However, given Russian legislation, the approach in which dsRNA is synthesized by the plant itself remains unavailable for crop protection. The use of exogenously produced dsRNA appears to be a promising alternative, allowing researchers to avoid genetic modification of plants, making it possible to implement potential results in agriculture. Furthermore, exogenous dsRNAs are superior to chemical pesticides (fungicides, insecticides, etc.), which are widely used to control various plant diseases. The dsRNA acts through sequence-specific nucleic acid interactions, making it extremely selective and unlikely to harm off-target organisms. Thus, it seems promising to utilize RNAi technology for agricultural plant protection. In this case, questions arise regarding how to produce the required amounts of pathogen-specific exogenous dsRNA, and which delivery method will be optimal for providing sufficient protection. This work aims to utilize exogenous dsRNA to silence the Nicotiana benthamiana phytoene desaturase gene. Phytoene desaturase is a convenient model gene in gene silencing experiments, as its knockdown results in a distinct phenotypic manifestation, namely, leaf bleaching. 
The dsRNA synthesis for this work was performed in vivo in Escherichia coli cells, and the chosen delivery method was root treatment through watering, both techniques being as simple and accessible as possible. It is surmised that the proposed approach could be adapted for broader use of RNAi technologies in agricultural crop protection.</p>","PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 8","pages":"1169-1175"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12876927/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146144128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper reviews existing approaches for reconstructing frame-based mathematical models of molecular genetic systems from the level of genetic synthesis to models of metabolic networks. A frame-based mathematical model is a model in which the following terms are specified: formal structure, type of mathematical model for a particular biochemical process, reactants and their roles. Typically, such models are generated automatically on the basis of description of biological processes in terms of domain-specific languages. For molecular genetic systems, these languages use constructions familiar to a wide range of biologists in the form of a list of biochemical reactions. They rely on the concepts of elementary subsystems, where complex models are assembled from small block units ("frames"). In this paper, we have shown an example with the generation of a classical repressilator model consisting of three genes that mutually inhibit each other's synthesis. We have given it in three different versions of the graphic standard, its characteristic mathematical interpretation and variants of its numerical calculation. We have shown that even at the level of frame models it is possible to identify qualitatively new behaviour of the model through the introduction of just one gene into the model structure. This change provides a way to control the modes of behaviour of the model through changing the concentrations of reactants. The frame-based approach opens the way to generate models of cells, tissues, organs, organisms and communities through frame-based model generation tools that specify structure, roles of modelled reactants using domain-specific languages and graphical methods of model specification.
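The repressilator mentioned in this abstract can be illustrated with a minimal numerical sketch. It uses a commonly cited reduced, protein-only form, dx_i/dt = alpha / (1 + x_{i-1}^n) - x_i, integrated with the explicit Euler method. This is an assumption made for illustration and not the frame-generated model from the paper; the parameter values, initial state and function name are hypothetical.

```python
def repressilator(alpha=50.0, n=3, x0=(1.0, 1.5, 2.0), dt=0.01, steps=10000):
    """Integrate dx_i/dt = alpha / (1 + x_{i-1}**n) - x_i with the explicit Euler method.

    Each of the three genes is repressed by the previous one in the cycle.
    Returns the list of states (x_0, x_1, x_2), one per time step.
    """
    x = list(x0)
    trajectory = [tuple(x)]
    for _ in range(steps):
        # x[i - 1] is the repressor of gene i; index -1 wraps around to the last gene
        dx = [alpha / (1.0 + x[i - 1] ** n) - x[i] for i in range(3)]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(tuple(x))
    return trajectory
```

With these illustrative parameters the symmetric steady state is unstable, so the concentrations settle into sustained oscillations rather than converging to constant values.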
{"title":"Frame-based mathematical models - a tool for the study of molecular genetic systems.","authors":"F V Kazantsev, S A Lashin, Yu G Matushkin","doi":"10.18699/vjgb-25-135","DOIUrl":"https://doi.org/10.18699/vjgb-25-135","url":null,"PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 8","pages":"1288-1294"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12876924/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146144143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M F Sanamyan, Sh U Bobokhujayev, Sh S Abdukarimov, J S Uralov, A B Rustamov
The creation of chromosome substitution lines containing one pair of chromosomes from a related species is one method for introgression of alien genetic material. The frequency of substitutions in different chromosomes of the genome varies due to the selective transmission of alien chromosomes through the gametes of hybrids. The use of monosomic lines with identified univalent chromosomes and molecular genetic SSR markers at the seedling stage allowed rapid screening of the identity of the alien chromosome in backcross hybrids, significantly accelerating and facilitating the backcrossing process for the creation of new chromosome substitution cotton lines. A study of the transmission of chromosome 2 of the At subgenome of cotton G. barbadense L. during backcrossing of four original monosomic lines of G. hirsutum L. with monosomic backcross hybrids carrying a substitution of chromosome 2 of the At subgenome revealed the following specific consequences of the introgression of this chromosome: decreased crossability, setting and germination of hybrid seeds; differences in the frequency and nature of transmission of chromosome 2 of the At subgenome of G. barbadense; regularity of chromosome behavior in meiosis; a high meiotic index; a significant decrease in pollen fertility in backcross monosomic hybrids BC1F1; and specific morphobiological characteristics of monosomic backcrossed plants, such as delayed development of vegetative and generative organs, dwarfism, reduced foliage, and poor budding and flowering during the first year of vegetation. All of these factors negatively impact the study and backcrossing of monosomic hybrids and significantly complicate and delay the creation of chromosome-substituted forms for chromosome 2 of the At subgenome of cotton G. barbadense. These specific changes likely occurred as a result of hybrid genome reorganization and introgression of alien chromatin. 
Furthermore, the effectiveness of using molecular genetic microsatellite (SSR) markers to monitor backcrossing processes and eliminate genetic material from the Pima 3-79 donor line of G. barbadense for the selection of genotypes with alien chromosome substitutions has been demonstrated.
{"title":"Study of the influence of introgression from chromosome 2 of the At subgenome of cotton Gossypium barbadense L. during backcrossing with the original lines of G. hirsutum L.","authors":"M F Sanamyan, Sh U Bobokhujayev, Sh S Abdukarimov, J S Uralov, A B Rustamov","doi":"10.18699/vjgb-25-125","DOIUrl":"https://doi.org/10.18699/vjgb-25-125","url":null,"PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 8","pages":"1184-1194"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12876928/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146144171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T V Ivanisenko, P S Demenkov, M A Kleshchev, V A Ivanisenko
In recent years, artificial intelligence methods based on the analysis of heterogeneous graphs of biomedical networks have become widely used for predicting molecular interactions. In particular, graph neural networks (GNNs) effectively identify missing edges in gene networks - such as protein-protein interaction, gene-disease, drug-target, and other networks - thereby enabling the prediction of new biological relationships. To reconstruct gene networks, cognitive systems for automatic text mining of scientific publications and databases are often employed. One such AI-driven platform, ANDSystem, is designed for automatic knowledge extraction of molecular interactions and, on this basis, the reconstruction of associative gene networks. The ANDSystem knowledge base contains information on more than 100 million interactions among diverse molecular genetic entities (genes, proteins, metabolites, drugs, etc.). The interactions span a wide range of types: regulatory relationships, physical interactions (protein-protein, protein-ligand), catalytic and chemical reactions, and associations among genes, phenotypes, diseases, and more. In the present study, we applied attention-based graph neural networks trained on the ANDSystem knowledge graph to predict new edges between proteins and ligands and to identify potential ligands for the SARS-CoV-2 ORF3a protein. The accessory protein ORF3a plays an important role in viral pathogenesis through ion-channel activity, induction of apoptosis, and the ability to modulate endolysosomal processes and the host innate immune response. Despite this broad functional spectrum, ORF3a has been explored far less as a pharmacological target than other viral proteins. Using a graph neural network, we predicted five small molecules of different origins (metabolites and a drug) that potentially interact with ORF3a: N-acetyl-D-glucosamine, 4-(benzoylamino)benzoic acid, austocystin D, bictegravir, and L-threonine. 
Molecular docking and MM/GBSA affinity estimation indicate the potential ability of these compounds to form complexes with ORF3a. Localization analysis showed that the binding sites of bictegravir and 4-(benzoylamino)benzoic acid lie in a cytosolic surface pocket of the protein that is solvent-exposed; L-threonine binds within the intersubunit cleft of the dimer; and austocystin D and N-acetyl-D-glucosamine are positioned at the boundary between the cytosolic surface and the transmembrane region. The accessibility of these binding sites may be reduced by the influence of the lipid bilayer. The binding energetics for bictegravir were more favorable than for 4-(benzoylamino)benzoic acid (docking score -7.37 kcal/mol; MM/GBSA ΔG -14.71 ± 3.12 kcal/mol), making bictegravir a promising candidate for repurposing as an ORF3a inhibitor.
{"title":"Prediction of interactions between the SARS-CoV-2 ORF3a protein and small-molecule ligands using the ANDSystem cognitive platform, graph neural networks, and molecular modeling.","authors":"T V Ivanisenko, P S Demenkov, M A Kleshchev, V A Ivanisenko","doi":"10.18699/vjgb-25-113","DOIUrl":"https://doi.org/10.18699/vjgb-25-113","url":null,"PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 7","pages":"1084-1096"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12799363/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145991272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vision plays a key role in the lives of various organisms, enabling spatial orientation, foraging, predator avoidance and social interaction. In species with relatively simple visual systems, such as insects, effective behavioral strategies are achieved through high neural specialization, adaptation to specific environmental conditions, and the use of additional sensory systems such as olfaction or hearing. Animals with more complex vision and nervous systems, such as mammals, have greater cognitive abilities and flexibility, but this comes with higher energy and computational demands on the brain. Modeling the features of such systems in a virtual environment could allow researchers to explore the fundamental principles of sensorimotor integration and the limits of cognitive complexity, as well as test hypotheses about the interaction between perception, memory and decision-making mechanisms. In this work, we implement and investigate a model of virtual organisms with a visual system operating in a three-dimensional physical environment using the Unity ML-Agents software - one of the highest-performance simulation platforms currently available. We propose a hierarchical control architecture that separates locomotion and navigation tasks between two modules: (1) visual perception and decision-making, and (2) coordinated control of limb movement for locomotion in the physical environment. A series of numerical experiments was conducted to examine the influence of visual system parameters (e.g., resolution of the "first-person" view), environmental configuration and agent architectural features on the efficiency and outcomes of reinforcement learning (using the PPO algorithm). The results demonstrate the existence of an optimal range of resolutions that provides a trade-off between computational complexity and success in accomplishing the task, while excessive dimensionality of sensory inputs or action space leads to slower learning. 
We performed system performance profiling and identified key bottlenecks in large-scale simulations. The discussion considers biological parallels, highlighting cases of high behavioral efficiency in insects with relatively low-resolution visual systems, and the potential of neuroevolutionary approaches for adapting agent architectures. The proposed approach and the results obtained are of potential interest to researchers working on biologically inspired artificial agents, evolutionary modeling, and the study of cognitive processes in artificial systems.
{"title":"Self-learning virtual organisms in a physics simulator: on the optimal resolution of their visual system, the architecture of the nervous system and the computational complexity of the problem.","authors":"M S Zenin, A P Devyaterikov, A Yu Palyanov","doi":"10.18699/vjgb-25-110","DOIUrl":"https://doi.org/10.18699/vjgb-25-110","url":null,"PeriodicalId":44339,"journal":{"name":"Vavilovskii Zhurnal Genetiki i Selektsii","volume":"29 7","pages":"1051-1061"},"PeriodicalIF":1.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12795856/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145970907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}