Single-cell RNA sequencing (scRNA-seq) technology has garnered considerable attention as it enables the exploration of cellular heterogeneity from a single-cell perspective. Various unsupervised methods, such as biclustering and clustering methods, offer a theoretical foundation for understanding the structure and function of cells. However, accurately identifying cell subtypes within complex scRNA-seq data remains challenging. To evaluate the current development status; summarize the strengths, weaknesses, and improvement strategies of unsupervised methods; and provide guidelines for future research, we surveyed five biclustering and 21 clustering methods applied to different types of scRNA-seq datasets. We employed three external and two internal metrics to determine clustering performance on 10 publicly available real datasets. Dataset properties were quantified from six perspectives to identify the most suitable biclustering or clustering methods. The results of this survey indicate that biclustering methods are effective for identifying local consistency or for deeply mining partially annotated datasets. Conversely, clustering methods are more suitable for dealing with unknown datasets. This survey aids in identifying cellular heterogeneity by recommending appropriate methods based on different dataset characteristics.
{"title":"A survey of biclustering and clustering methods in clustering different types of single-cell RNA sequencing data.","authors":"Chaowang Lan, Xiaoqi Tang, Caihua Liu","doi":"10.1093/bfgp/elaf010","journal":{"name":"Briefings in Functional Genomics","volume":"24"},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12342763/pdf/"}
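The survey above scores clustering results with external metrics, which compare predicted clusters against reference cell-type labels. The abstract does not name the metrics used, so the sketch below takes the adjusted Rand index purely as a representative example of an external metric; the function name and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    # Contingency counts: how many cells share each (reference, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: nothing to disagree about
    expected = sum_true * sum_pred / total_pairs
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. one cluster in both labelings
    return (index - expected) / (max_index - expected)


# Hypothetical labels for four cells; cluster names need not match.
reference = [0, 0, 1, 1]
predicted = [1, 1, 0, 0]
score = adjusted_rand_index(reference, predicted)
```

Because the index depends only on which cells are grouped together, relabeling the clusters leaves the score unchanged, which is why the example above counts as a perfect match.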
Thyroid cancer is one of the most common endocrine diseases worldwide and is phenotypically heterogeneous. Deubiquitinating enzymes (DUBs) participate in signaling induced by ubiquitin (Ub) conjugases by removing Ub from substrates. Dysregulation of DUBs is associated with cancer progression, including that of thyroid carcinoma. In this review, we outline the main classification and structure of DUBs, the expression of DUBs in thyroid cancer, the association of DUBs with survival, and the possible mechanisms by which DUBs act in thyroid cancer progression. Finally, we summarize the development of USP-specific inhibitors and the strategies for designing and identifying selective inhibitors.
{"title":"Expression and role of deubiquitinating enzymes in thyroid carcinoma.","authors":"Meiling Huang, Changjiao Yan, Rui Ling, Ting Wang","doi":"10.1093/bfgp/elaf022","journal":{"name":"Briefings in Functional Genomics","volume":"24"},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12700091/pdf/"}
Mohamed Azzam, Ziyang Xu, Ruobing Liu, Lie Li, Kah Meng Soh, Kishore B Challagundla, Shibiao Wan, Jieqiong Wang
The study of brain age has emerged over the past decade, aiming to estimate a person's age based on brain imaging scans. Ideally, predicted brain age should match chronological age in healthy individuals. However, brain structure and function change in the presence of brain-related diseases. Consequently, brain age also changes in affected individuals, making the brain age gap (BAG), defined as the difference between brain age and chronological age, a potential biomarker for brain health, early screening, and identifying age-related cognitive decline and disorders. With the recent successes of artificial intelligence in healthcare, it is essential to track the latest advancements and highlight promising directions. This review paper presents recent machine learning techniques used in brain age estimation (BAE) studies. Typically, BAE involves developing a machine learning regression model that captures age-related variations in brain structure from imaging scans of healthy individuals and automatically predicts brain age for new subjects. The process also involves estimating BAG as a measure of brain health. While we discuss recent clinical applications of BAE methods, we also review studies of biological age that can be integrated into BAE research. Finally, we point out the current limitations of BAE studies.
{"title":"A review of artificial intelligence-based brain age estimation and its applications for related diseases.","authors":"Mohamed Azzam, Ziyang Xu, Ruobing Liu, Lie Li, Kah Meng Soh, Kishore B Challagundla, Shibiao Wan, Jieqiong Wang","doi":"10.1093/bfgp/elae042","journal":{"name":"Briefings in Functional Genomics"},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735757/pdf/"}
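The brain age workflow described in the review above (fit a regression on healthy subjects, predict brain age for a new subject, then take the gap to chronological age) can be sketched in a few lines. This is a deliberately tiny illustration under stated assumptions: a single made-up imaging feature and ordinary least squares stand in for the high-dimensional features and machine learning models that real BAE studies use, and all numbers are invented.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def brain_age_gap(predicted_age, chronological_age):
    """BAG: predicted brain age minus chronological age."""
    return predicted_age - chronological_age


# Hypothetical healthy training cohort: one normalized imaging feature per subject.
healthy_feature = [1.0, 0.9, 0.8, 0.7]
healthy_age = [20.0, 40.0, 60.0, 80.0]
slope, intercept = fit_line(healthy_feature, healthy_age)

# Hypothetical new subject, chronologically 60 years old.
predicted = slope * 0.75 + intercept
gap = brain_age_gap(predicted, 60.0)
```

A positive gap means the scan looks older than the subject's chronological age under this toy model; a negative gap means it looks younger.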
Super-enhancers (SEs) are typically located in the regulatory regions of genes, driving high-level gene expression. Identifying SEs is crucial for a deeper understanding of gene regulatory networks, disease mechanisms, and the development and physiological processes of organisms, thus exerting a profound impact on research and applications in the life sciences. Traditional experimental methods for identifying SEs are costly and time-consuming. Existing methods for predicting SEs based solely on sequence data use deep learning for feature representation and have achieved good results. However, they overlook biological features related to physicochemical properties, leading to low interpretability. Additionally, their complex model structures often require extensive labeled data for training, which limits their further application to biological data. In this paper, we integrate the strengths of different models and propose an ensemble model based on an integration strategy to enhance the model's generalization ability. We design a multi-angle feature representation method that combines local structure and global information to extract high-dimensional abstract relationships and key low-dimensional biological features from sequences. This enhances the effectiveness and interpretability of the model's input features, providing technical support for discovering cell-specific and species-specific patterns of SEs. We evaluated performance on both mouse and human datasets using five metrics, including area under the receiver operating characteristic curve and accuracy. Compared to the latest models, EnsembleSE achieved an average improvement of 4.5% in F1 score and an average improvement of 8.05% in recall, demonstrating the robustness and adaptability of the model on a unified test set. Source code is available at https://github.com/2103374200/EnsembleSE-main.
{"title":"EnsembleSE: identification of super-enhancers based on ensemble learning.","authors":"Wenying He, Jialu Xu, Yun Zuo, Yude Bai, Fei Guo","doi":"10.1093/bfgp/elaf003","journal":{"name":"Briefings in Functional Genomics","volume":"24"},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12008123/pdf/"}
Ziming Su, Xinyu Zhang, Qiming Wang, Qianwei Tang, Dan Yang, Yaqing Liu
The downstream analysis of 16S rRNA sequencing data remains a significant challenge for researchers lacking extensive bioinformatics expertise, often requiring proficiency in diverse tools and methodologies. To address this, we present amplysis, an R package designed to streamline the analysis and visualization of 16S rRNA amplicon sequencing data through an intuitive, code-light workflow. amplysis integrates data importing, processing, statistical analysis, and visualization into a unified framework. Key features include data normalization, microbial composition profiling, alpha/beta diversity analysis, ordination methods (e.g. Principal Component Analysis), and publication-ready visualization tools. The package's utility was demonstrated through three case studies, one of which analyzed microbial community responses to hexachlorocyclohexane (HCH) degradation in groundwater environments. Using amplysis, we efficiently generated phylum/genus-level abundance plots, alpha-diversity indices, and Principal Coordinates Analysis ordination, revealing significant shifts in community structure and diversity under HCH stress. The other case studies utilized publicly available data from published studies by other researchers. These results underscore the package's ability to simplify complex analyses while ensuring reproducibility and high-quality output. By integrating modular, user-friendly functions, amplysis lowers the barrier to robust microbiome data exploration. The package is available on GitHub (https://github.com/min-perilla/amplysis), offering a valuable resource for researchers in microbial ecology and environmental genomics.
{"title":"amplysis: an R package for microbial composition and diversity analysis using 16S rRNA amplicon data.","authors":"Ziming Su, Xinyu Zhang, Qiming Wang, Qianwei Tang, Dan Yang, Yaqing Liu","doi":"10.1093/bfgp/elaf017","journal":{"name":"Briefings in Functional Genomics","volume":"24"},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12640542/pdf/"}
Spatial transcriptomics has revolutionized our ability to measure gene expression while preserving spatial information, thus facilitating detailed analysis of tissue structure and function. Identifying spatial domains accurately is key for understanding tissue microenvironments and biological progression. To overcome the challenge of integrating gene expression data with spatial information, we introduce the VARGG deep learning framework. VARGG combines a pretrained Vision Transformer (ViT) with a graph neural network autoencoder, utilizing ViT's self-attention mechanism to capture global contextual information and enhance understanding of spatial relationships. This framework is further enhanced by multi-layer gated residual graph neural networks and Gaussian noise, which improve feature representation and model generalizability across different data sources. The robustness and scalability of VARGG have been verified on different platforms (10x Visium, Slide-seqV2, Stereo-seq, and MERFISH) and datasets of different sizes (human glioblastoma, mouse embryo, breast cancer). Our results demonstrate that VARGG's ability to accurately delineate spatial domains can provide a deeper understanding of tissue structure and help identify key molecular markers and potential therapeutic targets, thereby improving our understanding of disease mechanisms and informing the development of personalized treatment strategies.
{"title":"VARGG: a deep learning framework advancing precise spatial domain identification and cellular heterogeneity analysis in spatial transcriptomics.","authors":"Mengqiu Wang, Zhiwei Zhang, Lixin Lei, Kaitai Han, Zhenghui Wang, Ruoyan Dai, Zijun Wang, Chaojing Shi, Xudong Zhao, Qianjin Guo","doi":"10.1093/bfgp/elaf018","journal":{"name":"Briefings in Functional Genomics","volume":"24"},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12640549/pdf/"}
Bac Dao, Van Ngu Trinh, Huy V Nguyen, Hoa L Nguyen, Thuc Duy Le, Phuc Loi Luu
Acute myeloid leukemia (AML) is a type of blood cancer with diverse genetic variations and DNA methylation alterations. By studying the interaction of gene mutations, expression, and DNA methylation, we aimed to gain valuable insights into the processes that lead to blocked differentiation in AML. We analyzed TCGA-LAML data (173 samples) with RNA sequencing and DNA methylation arrays, comparing FLT3 mutant (48) and wild-type (125) cases. We conducted differential gene expression analysis using cBioPortal, identified DNA methylation differences with the ChAMP tool, and correlated them with gene expression changes. Gene set enrichment analysis (g:Profiler) revealed significant biological processes and pathways. ShinyGO and GeneCards were used to find potential transcription factors and their binding sites among significant genes. We found that significant differentially expressed genes (DEGs) were negatively correlated with their most significant methylation probes (Pearson correlation coefficient of -0.49, P-value <0.001) between FLT3 mutant and wild-type groups. Moreover, our exploration of 450k CpG sites uncovered a global hypomethylated status in 168 DEGs. Notably, these methylation changes were enriched in the promoter regions of homeobox superfamily genes, which are crucial in transcriptional regulation pathways in blood cancer. Furthermore, in FLT3 mutant AML patient samples, we observed overexpression of WT1, a transcription factor known to bind homeobox family genes. This finding suggests a potential mechanism by which WT1 recruits TET2 to demethylate specific genomic regions. Integrating gene expression and DNA methylation analyses shed light on the impact of FLT3 mutations on cancer cell development and differentiation, supporting a two-hit model in AML. This research advances understanding of AML and fosters targeted therapeutic strategy development.
{"title":"Crosstalk between genomic variants and DNA methylation in FLT3 mutant acute myeloid leukemia.","authors":"Bac Dao, Van Ngu Trinh, Huy V Nguyen, Hoa L Nguyen, Thuc Duy Le, Phuc Loi Luu","doi":"10.1093/bfgp/elae028","journal":{"name":"Briefings in Functional Genomics"},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735749/pdf/"}
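The AML study above reports a negative Pearson correlation between differential expression and methylation at each gene's most significant probe. As a reminder of what that statistic measures, here is a minimal sketch of the Pearson coefficient; the paired values are invented for illustration and are not data from the study, and the function name is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-gene values: methylation difference vs. expression log fold change.
methylation_delta = [-0.30, -0.20, -0.10, 0.05, 0.15]
expression_logfc = [1.8, 1.1, 0.9, -0.2, -0.7]
r = pearson_r(methylation_delta, expression_logfc)
```

A value of `r` below zero corresponds to the pattern described in the abstract, where lower methylation accompanies higher expression.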
Aleksandra Cabaj, Agata Charzyńska, Adrianna Moszyńska, Maciej Jaśkiewicz, Rafał Bartoszewski, Michał Dąbrowski
We developed an ordinary differential equation (ODE) model of hypoxia signaling that takes into account HIF-2α in addition to HIF-1α. Our model can be separated into two parts: the first describes the production and degradation of the α subunits of HIF-1 and HIF-2 and their accumulation in response to hypoxia; the second describes how the α subunits cooperate with the β subunit in binding to cis-regulatory regions and activating HIF-target genes in response to hypoxia. In our previous work [1], using the first part of our model trained on time-series data from 0.9% hypoxia, we successfully predicted the response of the system to a further drop of oxygen to 0.3% hypoxia. This modeling result contributed to explaining the mechanism of the switch of control from HIF-1 to HIF-2 during the response of human primary endothelial cells to hypoxia. In another work [2], we experimentally demonstrated a linear proportionality between the counts of motifs assigned to HIF-1 in promoter open chromatin regions of genes and the effects of HIF-1 on the induction of these genes under hypoxia. We furthermore showed that such a proportionality is predicted by the subset of the ODE model of Nguyen et al. (2013) [3] that is shared with the second part of our ODE model. In the current work, we provide the details of our full ODE model and show that it leads to a prediction that HIF-1β can be a limiting factor of the response to hypoxia.
{"title":"A dynamic model of gene activation in response to hypoxia accounting for both HIF-1 and HIF-2.","authors":"Aleksandra Cabaj, Agata Charzyńska, Adrianna Moszyńska, Maciej Jaśkiewicz, Rafał Bartoszewski, Michał Dąbrowski","doi":"10.1093/bfgp/elaf021","journal":{"name":"Briefings in Functional Genomics","volume":"24"},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12700088/pdf/"}
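To make the first part of such a model concrete (production and degradation of an α subunit, with accumulation under hypoxia), the sketch below integrates a single toy equation, dH/dt = k_syn - k_deg(O2) * H, with forward Euler. It is not the authors' model: the saturating form of the oxygen-dependent degradation rate, every parameter value, and all function names are assumptions chosen only to show the qualitative behavior.

```python
def degradation_rate(oxygen, k_max=1.0, half_sat=1.0):
    """Assumed oxygen-dependent degradation rate: faster when oxygen is high."""
    return k_max * oxygen / (half_sat + oxygen)


def simulate_alpha(oxygen, k_syn=1.0, h0=0.0, dt=0.01, t_end=200.0):
    """Forward-Euler integration of dH/dt = k_syn - k_deg(oxygen) * H."""
    k_deg = degradation_rate(oxygen)
    h = h0
    for _ in range(int(round(t_end / dt))):
        h += dt * (k_syn - k_deg * h)
    return h


def steady_state(oxygen, k_syn=1.0):
    """Analytic steady state of the toy equation: production over degradation."""
    return k_syn / degradation_rate(oxygen)


# Oxygen levels in percent, echoing the 0.9% and 0.3% conditions in the abstract.
level_mild = simulate_alpha(0.9)
level_severe = simulate_alpha(0.3)
```

In this toy setting the subunit settles at a higher level when oxygen is lower, because degradation slows while production stays constant; the published model adds the second subunit, the β subunit, and target-gene activation on top of this kind of balance.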
Hailong Li, Xiaqing Gao, Shuangming Guo, Shenfei Gao, Chunting Yang, Rong Su, Zhe Jing, Shuping Qiu, Ping Tang, Jing Han
Aim: TPD52 (tumor protein D52) and TPD52L2 (tumor protein D52-like 2), members of the TPD52 gene family, have been implicated in multiple malignancies. However, their roles in gastric cancer (GC) remain elusive. Herein, we integrated multiomics analyses and experimental validation to elucidate their prognostic and functional significance in GC.
Methods: Utilizing The Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), and tissue microarray datasets, we analyzed TPD52/TPD52L2 expression patterns in patients with GC. Survival analysis, Cox regression, and nomogram construction were performed to assess prognostic value. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes functional enrichment analyses and immune infiltration evaluation with CIBERSORTx (Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts) and ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumour tissues using Expression data) were conducted to explore the molecular mechanisms involved. In vitro experiments (cell proliferation, migration, invasion, and apoptosis assays) were performed via lentivirus-mediated gene knockdown in the gastric cancer cell lines AGS and MKN45.
Results: TPD52 and TPD52L2 were significantly overexpressed in GC tissues compared with their normal counterparts. Elevated TPD52L2 expression was significantly associated with advanced Tumor, Node, Metastasis (TNM) stage, and multivariate Cox regression identified TPD52L2 as an independent predictor of reduced overall survival. Diagnostic Receiver Operating Characteristic (ROC) curves yielded area under the curve values of 0.813 (TPD52) and 0.807 (TPD52L2). Functional experiments suggested that TPD52/TPD52L2 knockdown inhibited proliferation and migration and induced G0/G1 arrest and apoptosis. Mechanistically, TPD52/TPD52L2 silencing suppressed PI3K/Akt serine/threonine kinase (AKT)/mammalian target of rapamycin (mTOR) signaling and epithelial-mesenchymal transition marker expression.
Conclusion: TPD52 and TPD52L2 are promising prognostic biomarkers in GC, with TPD52L2 exhibiting greater clinical relevance. Targeting these proteins may disrupt oncogenic signaling pathways and enhance immunotherapy efficacy, warranting further investigation in clinical trials.
{"title":"Dual oncogenic roles of TPD52 and TPD52L2 in gastric cancer progression via PI3K/AKT activation and immunosuppressive microenvironment remodeling.","authors":"Hailong Li, Xiaqing Gao, Shuangming Guo, Shenfei Gao, Chunting Yang, Rong Su, Zhe Jing, Shuping Qiu, Ping Tang, Jing Han","doi":"10.1093/bfgp/elaf015","DOIUrl":"10.1093/bfgp/elaf015","url":null,"abstract":"<p><strong>Aim: </strong>TPD52 (tumor protein D52) and TPD52L2 (tumor protein D52-like 2), members of the TPD52 gene family, have been implicated in multiple malignancies. However, their roles in gastric cancer (GC) remain elusive. Herein, we integrated multiomics analyses and experimental validation to elucidate their prognostic and functional significance in GC.</p><p><strong>Methods: </strong>Utilizing The Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), and tissue microarray datasets, we analyzed TPD52/TPD52L2 expression patterns in patients with GC. Survival analysis, Cox regression, and nomogram construction were performed to assess prognostic value. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes functional enrichment analysis and immune infiltration evaluation (Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts/Estimation of STromal and Immune cells in MAlignant Tumour tissues using Expression data) (CIBERSORTx/ESTIMATE) were conducted to explore the molecular mechanisms involved. In vitro experiments (cell proliferation, migration, invasion, and apoptosis assays) were performed via lentivirus-mediated gene knockdown in gastric cancer cell lines AGS and MKN45 cells.</p><p><strong>Results: </strong>TPD52 and TPD52L2 were significantly overexpressed in GC tissues compared with their normal counterparts. Elevated TPD52L2 expression was significantly associated with advanced Tumor, Node, Metastasis (TNM) stage and independently predicted reduced overall survival according to multivariate Cox regression. 
Multivariate analysis identified TPD52L2 as an independent prognostic factor. Diagnostic Receiver Operating Characteristic (ROC) curves yielded area under the curve values of 0.813 (TPD52) and 0.807 (TPD52L2). The results of functional experiments suggested that TPD52/TPD52L2 knockdown inhibited proliferation, migration, G0/G1 arrest, and induced apoptosis. Mechanistically, TPD52/TPD52L2 silencing suppressed PI3K/Akt serine/threonine kinase (AKT)/mammalian target of rapamycin (mTOR) signaling and epithelial-mesenchymal transition marker expression.</p><p><strong>Conclusion: </strong>TPD52 and TPD52L2 are promising prognostic biomarkers in GC, with TPD52L2 exhibiting greater clinical relevance. Targeting these proteins may disrupt oncogenic signaling pathways and enhance immunotherapy efficacy, warranting further investigation in clinical trials.</p>","PeriodicalId":55323,"journal":{"name":"Briefings in Functional Genomics","volume":"24 ","pages":""},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12449195/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145093137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The CRISPR/Cas9 system developed from Streptococcus pyogenes (SpCas9) has high potential in gene editing. However, its successful application is hindered by the considerable variability in targeting efficiency across different single guide RNAs (sgRNAs). Although several deep learning models have been created to predict sgRNA on-target activity, the intrinsic mechanisms of these models are difficult to explain, and there is still scope for improvement in prediction performance. To overcome these issues, we propose an interpretable ensemble model based on deep learning, termed DeepMEns, to predict sgRNA on-target activity. Using five different training and validation datasets, we constructed five sub-regressors, each comprising three parts. The first part uses one-hot encoding, in which a 0-1 representation of the secondary structure is used as the input to a convolutional neural network (CNN) with a Transformer encoder. The second part uses the DNA shape feature matrix as the input to a CNN with a Transformer encoder. The third part uses positional encoding feature matrices as the input to a long short-term memory network with an attention mechanism. These three parts are concatenated through a flatten layer, and the final prediction is the average of the five sub-regressors. Extensive benchmarking experiments indicated that DeepMEns achieved the highest Spearman correlation coefficient on 6 of 10 independent test datasets compared with previous predictors, indicating that DeepMEns can achieve state-of-the-art performance. Moreover, the ablation analysis indicated that the ensemble strategy may improve the performance of the prediction model.
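The two generic ideas in this abstract, encoding a sequence as a 0-1 matrix and averaging the outputs of five sub-regressors, can be illustrated with a minimal sketch. It is not the DeepMEns implementation: the nucleotide one-hot scheme is an assumption that may differ from the authors' encoding, and `make_toy_regressor` is a hypothetical stand-in for the trained CNN/Transformer/LSTM sub-regressors.

```python
from statistics import mean

BASES = "ACGT"  # assumed column order for the one-hot matrix


def one_hot(seq):
    """Encode a DNA sequence as a len(seq) x 4 matrix of 0/1 values.

    Characters outside A/C/G/T (for example N) become an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def ensemble_predict(sub_regressors, features):
    """Return the average of the sub-regressors' predictions."""
    return mean(model(features) for model in sub_regressors)


def make_toy_regressor(weight):
    """Hypothetical stand-in for a trained sub-regressor.

    It scores a one-hot matrix by its G/C fraction scaled by `weight`;
    a real sub-regressor would be a trained neural network.
    """
    def predict(matrix):
        gc_count = sum(row[1] + row[2] for row in matrix)  # columns C and G
        return weight * gc_count / len(matrix)
    return predict


# Five toy sub-regressors, mirroring the five-member ensemble in the text.
toy_models = [make_toy_regressor(w) for w in (0.5, 0.75, 1.0, 1.25, 1.5)]
encoded = one_hot("ACGT")
score = ensemble_predict(toy_models, encoded)
print(score)
```

Averaging is the simplest way to combine regressors trained on different data splits; it tends to reduce the variance of any single member, which is in line with the ablation result reported above.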
{"title":"DeepMEns: an ensemble model for predicting sgRNA on-target activity based on multiple features.","authors":"Shumei Ding, Jia Zheng, Cangzhi Jia","doi":"10.1093/bfgp/elae043","DOIUrl":"10.1093/bfgp/elae043","url":null,"abstract":"<p><p>The CRISPR/Cas9 system developed from Streptococcus pyogenes (SpCas9) has high potential in gene editing. However, its successful application is hindered by the considerable variability in target efficiencies across different single guide RNAs (sgRNAs). Although several deep learning models have been created to predict sgRNA on-target activity, the intrinsic mechanisms of these models are difficult to explain, and there is still scope for improvement in prediction performance. To overcome these issues, we propose an ensemble interpretable model termed DeepMEns based on deep learning to predict sgRNA on-target activity. By using five different training and validation datasets, we constructed five sub-regressors, each comprising three parts. The first part uses one-hot encoding, wherein 0-1 representation of the secondary structure is used as the input to the convolutional neural network (CNN) with Transformer encoder. The second part uses the DNA shape feature matrix as the input to the CNN with Transformer encoder. The third part uses positional encoding feature matrices as the proposed input into a long short-term memory network with an attention mechanism. These three parts are concatenated through the flattened layer, and the final prediction result is the average of the five sub-regressors. Extensive benchmarking experiments indicated that DeepMEns achieved the highest Spearman correlation coefficient for 6 of 10 independent test datasets as compared to previous predictors, this finding confirmed that DeepMEns can accomplish state-of-the-art performance. 
Moreover, the ablation analysis also indicated that the ensemble strategy may improve the performance of the prediction model.</p>","PeriodicalId":55323,"journal":{"name":"Briefings in Functional Genomics","volume":" ","pages":""},"PeriodicalIF":2.5,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735754/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142630918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}