Pub Date: 2024-09-28; eCollection Date: 2024-01-01; DOI: 10.1177/11779322241282489
Sam Freesun Friedman, Gemma Elyse Moran, Marianne Rakic, Anthony Phillipakis
The advent of biobanks with vast quantities of medical imaging and paired genetic measurements creates huge opportunities for a new generation of genotype-phenotype association studies. However, disentangling biological signals from the many sources of bias and artifacts remains difficult. Using diverse medical images and time-series (ie, magnetic resonance imagings [MRIs], electrocardiograms [ECGs], and dual-energy X-ray absorptiometries [DXAs]), we show how registration, both spatial and temporal, guided by domain knowledge or learned de novo, helps uncover biological information. A multimodal autoencoder comparison framework quantifies and characterizes how registration affects the representations that unsupervised and self-supervised encoders learn. In this study we (1) train autoencoders before and after registration with nine diverse types of medical image, (2) demonstrate how neural network-based methods (VoxelMorph, DeepCycle, and DropFuse) can effectively learn registrations, allowing for more flexible and efficient processing than is possible with hand-crafted registration techniques, and (3) conduct exhaustive phenotypic screening, comprising millions of statistical tests, to quantify how registration affects the generalizability of learned representations. Genome- and phenome-wide association studies (GWAS and PheWAS) uncover significantly more associations with registered modality representations than with equivalently trained and sized representations learned from native coordinate spaces. Specifically, registered PheWAS yielded 61 more disease associations for ECGs, 53 more disease associations for cardiac MRIs, and 10 more disease associations for brain MRIs. Registration also yields significant increases in the coefficient of determination when regressing continuous phenotypes (eg, 0.36 ± 0.01 with ECGs and 0.11 ± 0.02 for DXA scans). Our findings reveal the crucial role registration plays in enhancing the characterization of physiological states across a broad range of medical imaging data types. Importantly, this finding extends to more flexible types of registration, such as the cross-modal and the circular mapping methods presented here.
Title: Genetic Architectures of Medical Images Revealed by Registration of Multiple Modalities. Journal: Bioinformatics and Biology Insights, volume 18, article 11779322241282489. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11450573/pdf/
Background: Thymoma is a key risk factor for myasthenia gravis (MG). The purpose of our study was to investigate the potential key genes responsible for MG in patients with thymoma.
Methods: We obtained MG and thymoma datasets from the GEO database. Differentially expressed genes (DEGs) were determined and functional enrichment analyses were conducted using R packages. Weighted gene co-expression network analysis (WGCNA) was used to screen out the crucial module genes related to thymoma. Candidate genes were obtained by integrating the DEGs of MG and the module genes. Subsequently, we used machine learning to identify several candidate key genes for diagnosing MG in patients with thymoma. A nomogram and receiver operating characteristic (ROC) curves were applied to assess the diagnostic value of the candidate key genes. Finally, we investigated the infiltration of immunocytes and analyzed the relationship between key genes and immune cells.
Results: We obtained 337 DEGs in the MG dataset and 2150 DEGs in the thymoma dataset. Biological function analyses indicated that the DEGs of MG and thymoma were enriched in many common pathways. The black module (containing 207 genes) identified by WGCNA was the most strongly correlated with thymoma. Then, 12 candidate genes were identified by intersecting the MG DEGs with the thymoma module genes as potential causes of thymoma-associated MG pathogenesis. Furthermore, five candidate key genes (JAM3, MS4A4A, MS4A6A, EGR1, and FOS) were screened out by integrating least absolute shrinkage and selection operator (LASSO) regression and random forest (RF). The nomogram and ROC curves (area under the curve from 0.833 to 0.929) suggested that all five candidate key genes had high diagnostic value. Finally, we found that the five key genes and immune cell infiltration showed varying degrees of correlation.
Conclusions: Our study identified five potential key pathogenic genes that may predispose patients with thymoma to the development of MG, providing potential diagnostic biomarkers and promising therapeutic targets for MG patients with thymoma.
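The selection logic described in the Methods can be sketched in a few lines. The sketch below is illustrative only: it assumes that "integrating" the DEG, module, LASSO, and RF results means taking set intersections, the gene names and expression values are hypothetical placeholders rather than data from the study, and `auc` is a plain rank-based estimate, not the authors' ROC code.

```python
def intersect_gene_sets(*gene_sets):
    """Return the genes present in every input collection."""
    sets = [set(g) for g in gene_sets]
    return set.intersection(*sets)


def auc(case_values, control_values):
    """Rank-based AUC: chance that a random case scores above a random control.

    Ties count as half a win, which makes this the Mann-Whitney U statistic
    divided by the number of case/control pairs.
    """
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical placeholder inputs, not data from the study.
mg_degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
thymoma_module = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}
candidates = intersect_gene_sets(mg_degs, thymoma_module)

# Assumed outputs of two independent feature-selection runs.
lasso_selected = {"GENE_B", "GENE_C"}
rf_selected = {"GENE_C", "GENE_D"}
key_genes = intersect_gene_sets(candidates, lasso_selected, rf_selected)

# Hypothetical expression of one key gene in cases versus controls.
gene_auc = auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
print(sorted(candidates), sorted(key_genes), round(gene_auc, 3))
```

Requiring agreement between two selectors trades sensitivity for stability: a gene survives only if both methods pick it, which is one plausible reading of how 12 candidates could shrink to five.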
Title: Identification of Potential Key Genes for the Comorbidity of Myasthenia Gravis With Thymoma by Integrated Bioinformatics Analysis and Machine Learning. Authors: Hui Liu, Geyu Liu, Rongjing Guo, Shuang Li, Ting Chang. Pub Date: 2024-09-26; DOI: 10.1177/11779322241281652. Journal: Bioinformatics and Biology Insights, volume 18, article 11779322241281652. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11437577/pdf/
Pub Date: 2024-09-24; eCollection Date: 2024-01-01; DOI: 10.1177/11779322241281188
Lin Miao, Douglas E Weidemann, Katherine Ngo, Benjamin A Unruh, Shihoko Kojima
Rhythmic transcripts play pivotal roles in driving the daily oscillations of various biological processes. Genetic or environmental disruptions can lead to alterations in the rhythmicity of transcripts, ultimately impacting downstream circadian outputs, including metabolic processes and even behavior. To statistically compare the differences in transcript rhythms between 2 or more conditions, several algorithms have been developed to analyze circadian transcriptomic data, each with distinct features. In this study, we compared the performance of 7 algorithms that were specifically designed to detect differential rhythmicity (DODR, LimoRhyde, CircaCompare, compareRhythms, diffCircadian, dryR, and RepeatedCircadian). We found that even when applying the same statistical threshold, these algorithms yielded varying numbers of differentially rhythmic transcripts, most likely because each algorithm defines rhythmic and differentially rhythmic transcripts differently. Nevertheless, the outputs for the differential phase and amplitude were identical between dryR and compareRhythms, and between diffCircadian and CircaCompare, while the output from LimoRhyde2 was highly correlated with that from diffCircadian and CircaCompare. Because each algorithm has unique requirements for input data and reports different information as an output, it is crucial to ensure the compatibility of input data with the chosen algorithm and to assess whether the algorithm's output fits the user's needs when selecting an algorithm for analysis.
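To make "differential phase and amplitude" concrete, the sketch below fits a single-component cosinor model to each condition and reports the amplitude and phase differences. It is an assumption-laden illustration, not the implementation of any of the 7 algorithms: the function names, the fixed 24-hour period, and the synthetic data are hypothetical, and no statistical test is performed.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_cosinor(times, values, period=24.0):
    """Least-squares fit of y = mesor + amp * cos(w * (t - phase))."""
    w = 2 * math.pi / period
    rows = [[1.0, math.cos(w * t), math.sin(w * t)] for t in times]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, a, b = solve_linear(xtx, xty)
    amp = math.hypot(a, b)
    phase = (math.atan2(b, a) / w) % period  # time of peak, in hours
    return mesor, amp, phase


def compare_rhythms(fit_a, fit_b, period=24.0):
    """Return (amplitude difference, phase shift wrapped to +/- half a period)."""
    _, amp_a, phase_a = fit_a
    _, amp_b, phase_b = fit_b
    shift = ((phase_b - phase_a + period / 2) % period) - period / 2
    return amp_b - amp_a, shift


# Synthetic, noise-free example: samples every 4 h over 48 h.
w = 2 * math.pi / 24.0
times = list(range(0, 48, 4))
cond_a = [10 + 3.0 * math.cos(w * (t - 6)) for t in times]
cond_b = [10 + 1.5 * math.cos(w * (t - 10)) for t in times]
d_amp, d_phase = compare_rhythms(fit_cosinor(times, cond_a), fit_cosinor(times, cond_b))
print(round(d_amp, 3), round(d_phase, 3))
```

Tools can agree on these point estimates and still disagree on which transcripts they call differentially rhythmic, because the calls depend on each tool's own test and its definition of rhythmicity.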
Title: A Comparative Study of Algorithms Detecting Differential Rhythmicity in Transcriptomic Data. Journal: Bioinformatics and Biology Insights, volume 18, article 11779322241281188. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11440551/pdf/
Pub Date: 2024-09-23; eCollection Date: 2024-01-01; DOI: 10.1177/11779322241280580
Dipta Chandra Pal, Tasnimul Arabi Anik, Atiq Abrar Rahman, S M Mahfujur Rahman
Providencia rettgeri has increasingly been responsible for several infections, including urinary tract infections, post-burn wound infections, neonatal sepsis, and others. The emergence of drug-resistant isolates of P rettgeri, accompanied by intrinsic and acquired antibiotic resistance, has exacerbated the challenge of treating such infections, necessitating the development of novel therapeutics. Hypothetical proteins (HPs) form a major portion of cellular proteins and can be targeted by these novel therapeutics. In this study, 410 HPs from a pan-drug-resistant (PDR) P rettgeri strain (MRSN845308) were functionally annotated and characterized by physicochemical properties, localization, virulence, essentiality, druggability, and functionality. Among the 410 HPs, the VirulentPred 2.0 tool and VICMpred together predicted 33 HPs as virulent, whereas 48 HPs were highly interacting proteins based on the STRING v12 database. BlastKOALA and eggNOG-mapper v2.1.12 predicted 13 HPs involved in several metabolic pathways such as riboflavin metabolism and lipopolysaccharide biosynthesis. Overall, 83 HPs were selected as primary drug targets; however, only 80 remained after nonhomology searching and essentiality analysis. In addition, all were detected as novel drug targets according to DrugBank 5.1.12. Considering the potential of membrane and extracellular proteins, 29 HPs (extracellular, outer, and inner membrane) were selected based on the combined prediction from PSORTb v3.0.3, CELLO v.2.5, BUSCA, SOSUIGramN, and PSLpred. According to the prevalence of those HPs in different strains of P rettgeri sequences in National Center for Biotechnology Information Identical Protein Groups (NCBI-IPG), 5 HPs were selected as final drug targets. In addition, 5 other HPs annotated as transporter proteins were also added to the list. As no crystal structures of our targets are available, 3-dimensional structures of the selected HPs were predicted by the AlphaFold Server powered by AlphaFold 3. Our findings might facilitate a better understanding of the mechanism of virulence and pathogenesis, and up-to-date annotations can make uncharacterized HPs easy to identify as targets for novel therapeutics.
Title: Identification and Functional Annotation of Hypothetical Proteins of Pan-Drug-Resistant Providencia rettgeri Strain MRSN845308 Toward Designing Antimicrobial Drug Targets. Journal: Bioinformatics and Biology Insights, volume 18, article 11779322241280580. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11452876/pdf/
Pub Date: 2024-09-18; eCollection Date: 2024-01-01; DOI: 10.1177/11779322241271550
Milene Volpato, Mark Hull, Ian M Carr
Gene ontology phrases are a widely used set of hierarchical terms that describe the biological properties of genes. These terms are used to annotate individual genes, making it possible to determine the likely physiological properties of groups of genes such as a list of differentially expressed genes. Consequently, their ability to predict changes in biological features and functions based on alterations in gene expression has made gene ontology terms popular in a wide range of bioinformatic fields, such as differential gene expression and evolutionary biology. However, while they make the analysis easier, it is seldom easy to convey the results in a readily understandable manner. A number of applications have been developed to visualize gene ontology (GO) term enrichment; however, these solutions tend to focus on the display of aggregated results from a single analysis, making them unsuitable for the analysis of a series of experiments such as a time course or response to different drug treatments. As multiple pairwise comparisons are becoming a common feature of RNA profiling experiments, the absence of a mechanism to easily compare them is a significant problem. To overcome this obstacle, we have developed GOTermViewer, an application that displays GO term enrichment data as determined by GOstats, so that changes in physiological response can be visualized across a number of individual analyses, such as a time course or a range of drug treatments.
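The kind of data such a viewer displays can be sketched as a term-by-comparison table of enrichment scores. The sketch below is a rough illustration under stated assumptions: it uses a one-sided hypergeometric test of the kind commonly used for GO over-representation, the function names, term labels, and counts are hypothetical, and nothing here reproduces GOTermViewer's or GOstats' actual code or file formats.

```python
from math import comb, log10


def enrichment_p(universe, annotated, selected, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    universe:  number of genes tested
    annotated: genes in the universe carrying the GO term
    selected:  genes in the differentially expressed list
    overlap:   selected genes that carry the GO term
    """
    upper = min(annotated, selected)
    hits = sum(
        comb(annotated, i) * comb(universe - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return hits / comb(universe, selected)


def enrichment_table(universe, term_sizes, comparisons):
    """Rows are GO terms, columns are comparisons, cells are -log10(p)."""
    table = {}
    for term, annotated in term_sizes.items():
        row = {}
        for name, (selected, overlaps) in comparisons.items():
            p = enrichment_p(universe, annotated, selected, overlaps.get(term, 0))
            row[name] = -log10(p)
        table[term] = row
    return table


# Hypothetical time course: two comparisons, two placeholder terms.
term_sizes = {"term_A": 50, "term_B": 20}
comparisons = {
    "2h_vs_0h": (100, {"term_A": 15}),
    "24h_vs_0h": (100, {"term_A": 5, "term_B": 8}),
}
table = enrichment_table(1000, term_sizes, comparisons)
for term, row in table.items():
    print(term, {name: round(score, 2) for name, score in row.items()})
```

Laying the scores out this way shows at a glance which terms gain or lose enrichment from one comparison to the next, which a single aggregated result cannot show.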
Title: GOTermViewer: Visualization of Gene Ontology Enrichment in Multiple Differential Gene Expression Analyses. Journal: Bioinformatics and Biology Insights, volume 18, article 11779322241271550. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11418229/pdf/
Pub Date: 2024-09-17; eCollection Date: 2024-01-01; DOI: 10.1177/11779322241276936
Tomonori Hoshino, Hajime Takase, Hidehiro Ishikawa, Gen Hamanaka, Shintaro Kimura, Norito Fukuda, Ji Hyun Park, Hiroki Nakajima, Hisashi Shirakawa, Akihiro Shindo, Kyu-Won Kim, Irwin H Gelman, Josephine Lok, Ken Arai
A-kinase anchor protein 12 (AKAP12), a scaffold protein, has been implicated in central nervous system function, including blood-brain barrier (BBB) function. Although its expression level in the corpus callosum is higher than in other brain regions, such as the cerebral cortex, the role of AKAP12 in the corpus callosum remains unclear. In this study, we investigate the impact of AKAP12 deficiency by transcriptome analysis using RNA-sequencing (RNA-seq) on the corpus callosum of AKAP12 knockout (KO) mice. We observed minimal changes, with only 13 genes showing differential expression, including Akap12 itself. Notably, Klf2 and Sgk1, genes potentially involved in BBB function, were downregulated in AKAP12 KO mice and, like Akap12, were expressed in vascular cells. These changes in gene expression may affect important biological pathways that may be associated with neurological disorders. Our findings provide an additional data set for future research on the role of AKAP12 in the central nervous system.
Title: Transcriptomic Profiles of AKAP12 Deficiency in Mouse Corpus Callosum. Journal: Bioinformatics and Biology Insights, volume 18, article 11779322241276936. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11439161/pdf/
Pub Date: 2024-09-15; eCollection Date: 2024-01-01; DOI: 10.1177/11779322241271535
Roberta Coletti, Mónica Leiria de Mendonça, Susana Vinga, Marta B Lopes
Tumor heterogeneity is a challenge to designing effective and targeted therapies. Glioma-type identification depends on specific molecular and histological features, which are defined by the official World Health Organization (WHO) classification of tumors of the central nervous system (CNS). These guidelines are constantly updated to support the diagnosis process, which affects all subsequent clinical decisions. In this context, the search for new potential diagnostic and prognostic targets, characteristic of each glioma type, is crucial to support the development of novel therapies. Based on The Cancer Genome Atlas (TCGA) glioma RNA-sequencing data set updated according to the 2016 and 2021 WHO guidelines, we proposed a 2-step variable selection approach for biomarker discovery. Our framework uses the graphical lasso algorithm to estimate sparse networks of genes carrying diagnostic information. These networks are then used as input for a regularized Cox survival regression model, allowing the identification of a smaller subset of genes with prognostic value. In each step, the results derived from the 2016 and 2021 classes were discussed and compared. For both WHO glioma classifications, our analysis identifies potential biomarkers characteristic of each glioma type. Yet, better results were obtained for the 2021 WHO CNS classification, thereby supporting recent efforts to include molecular data in glioma classification.
Title: Inferring Diagnostic and Prognostic Gene Expression Signatures Across WHO Glioma Classifications: A Network-Based Approach. Journal: Bioinformatics and Biology Insights, volume 18, article 11779322241271535. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11403688/pdf/
Pub Date : 2024-09-11 eCollection Date: 2024-01-01 DOI: 10.1177/11779322241272399
Hassan M Al-Emran, Fazlur Rahman, Laxmi Sarkar, Prosanto Kumar Das, Provakar Mondol, Suriya Yesmin, Pipasha Sultana, Toukir Ahammed, Rasel Parvez, Md Shazid Hasan, Shovon Lal Sarkar, M Shaminur Rahman, Anamica Hossain, Mahmudur Rahman, Ovinu Kibria Islam, Md Tanvir Islam, Shireen Nigar, Selina Akter, A S M Rubayet Ul Alam, Mohammad Mahfuzur Rahman, Iqbal Kabir Jahid, M Anwar Hossain
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which emerged in late 2019, has accumulated a series of point mutations and evolved into several variants of concern (VOCs), some of which are more transmissible and potentially more severe than the original strain. The most notable VOCs are Alpha, Beta, Gamma, Delta, and Omicron, which have spread to various parts of the world. This study conducted surveillance in Jashore, Bangladesh, to identify the prevalence of SARS-CoV-2 coinfection with dengue virus and its genomic effect on the emergence of VOCs. Hospital-based COVID-19 surveillance from June to August 2021 identified 9453 positive patients in the surveillance area. The study enrolled 572 randomly selected COVID-19-positive patients, of whom 11 (2%) had dengue viral coinfection. Whole genome sequences of SARS-CoV-2 were analyzed and compared between the coinfection-positive and coinfection-negative groups. In addition, we extracted 185 genome sequences from GISAID to investigate the cross-correlation function between SARS-CoV-2 mutations and VOCs; multiple ARIMAX(p,d,q) models were developed to estimate the average number of amino acid (aa) substitutions among different SARS-CoV-2 VOCs. The results showed that the coinfection group had an average of 30.6 (±1.7) aa substitutions in SARS-CoV-2, whereas the dengue-negative COVID-19 group had an average of 25.6 (±1.8; P < .01). The coinfection group showed a significant difference in aa substitutions in the open reading frame (ORF) and N-protein when compared with the dengue-negative group (P = .03). Our ARIMAX models estimated that the emergence of the Delta variant required 9 to 12 more aa substitutions than the Alpha, Beta, or Gamma variants. Omicron accumulated 19 (95% confidence interval [CI]: 15.74, 21.95) more aa substitutions than Delta.
The increased number of point mutations in SARS-CoV-2 genomes from coinfected cases could be due to compromised host immune function and induced adaptability of pathogens during coinfection. As a result, new variants might emerge when a series of coinfection events occurs during two concurrent epidemics.
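The abstract mentions a cross-correlation function between mutation counts and VOCs but does not describe how it was computed. The following is only a minimal sketch of a sample cross-correlation at non-negative lags, using the Python standard library. The function name `cross_correlation` and both series are hypothetical illustrations, not the authors' code or data, and the ARIMAX modelling step is not reproduced here.

```python
from math import sqrt


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag] for lag >= 0."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Cross-covariance at the requested lag, divided by n so that the
    # result always stays within [-1, 1].
    cov = sum((x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag)) / n
    sd_x = sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sd_x * sd_y)


# Made-up illustrative series (NOT data from the study): mean aa substitutions
# per period and the share of sequences assigned to a later variant.
substitutions = [20.0, 22.0, 21.0, 25.0, 27.0, 30.0, 29.0, 33.0]
variant_share = [0.0, 0.1, 0.1, 0.2, 0.4, 0.6, 0.7, 0.9]

ccf = {lag: cross_correlation(substitutions, variant_share, lag) for lag in range(3)}
for lag, value in ccf.items():
    print(f"lag {lag}: {value:.3f}")
```

A large value at a given lag would suggest that the first series tends to move ahead of the second by that many periods. In an ARIMAX setting, such lags are typically used to decide how to enter the exogenous regressor.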
{"title":"Emergence of SARS-CoV-2 Variants Are Induced by Coinfections With Dengue.","authors":"Hassan M Al-Emran, Fazlur Rahman, Laxmi Sarkar, Prosanto Kumar Das, Provakar Mondol, Suriya Yesmin, Pipasha Sultana, Toukir Ahammed, Rasel Parvez, Md Shazid Hasan, Shovon Lal Sarkar, M Shaminur Rahman, Anamica Hossain, Mahmudur Rahman, Ovinu Kibria Islam, Md Tanvir Islam, Shireen Nigar, Selina Akter, A S M Rubayet Ul Alam, Mohammad Mahfuzur Rahman, Iqbal Kabir Jahid, M Anwar Hossain","doi":"10.1177/11779322241272399","DOIUrl":"https://doi.org/10.1177/11779322241272399","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights","volume":"18","pages":"11779322241272399"},"PeriodicalIF":2.3,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11406487/pdf/","platform":"Semanticscholar","paperid":"142280202","RegionNum":0,"PMCID":"OA","Total":0}
Pub Date : 2024-09-11 DOI: 10.1177/11779322241274958
Sabbir Ahmed, Md Arju Hossain, Sadia Afrin Bristy, Md Shahjahan Ali, Md Habibur Rahman
Owing to the recent emergence of COVID-19, there is a lack of published research and clinical recommendations on posttraumatic stress disorder (PTSD) risk factors in patients who contracted or received treatment for the virus. This research aims to identify potential molecular targets to inform therapeutic strategies for this patient population. RNA sequence data for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and PTSD (from the National Center for Biotechnology Information [NCBI]) were processed using the GREIN database. Protein-protein interaction (PPI) networks, pathway enrichment analyses, miRNA interactions, gene regulatory network (GRN) studies, and identification of linked drugs, chemicals, and diseases were conducted using STRING, DAVID, Enrichr, Metascape, ShinyGO, and NetworkAnalyst v3.0. Our analysis identified 15 potentially unique hub proteins within significantly enriched pathways, including PSMB9, MX1, HLA-DOB, HLA-DRA, IFIT3, OASL, and RSAD2, among others, filtered from a pool of 201 common differentially expressed genes (DEGs). Gene ontology (GO) terms and metabolic pathway analyses revealed the significance of the extracellular region, extracellular space, extracellular exosome, adaptive immune system, and interleukin (IL)-18 signaling pathways. In addition, we discovered several miRNAs (hsa-mir-124-3p, hsa-mir-146a-5p, hsa-mir-148b-3p, and hsa-mir-21-3p), transcription factors (TFs) (WRNIP1, FOXC1, GATA2, CREB1, and RELA), a potentially repurposable drug (carfilzomib), and chemicals (tetrachlorodibenzodioxin, estradiol, arsenic trioxide, and valproic acid) that could regulate the expression levels of hub proteins at both the transcriptional and posttranscriptional stages. Our investigations have identified several potential therapeutic targets that may shed light on the likelihood of PTSD in COVID-19 patients.
However, these targets require further exploration through clinical and pharmacological studies to establish their efficacy in preventing PTSD in COVID-19 patients.
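The two steps named in the abstract, finding DEGs common to both conditions and then picking hub proteins from a PPI network, can be sketched roughly as follows. This is an assumption-laden illustration: the abstract does not say which hub criterion was used, so node degree is taken here as one common choice. The gene labels `G1`..`G6`, the edge list, and the function names are hypothetical placeholders rather than results from STRING or any other tool.

```python
def common_degs(degs_a, degs_b):
    """Return the genes that are differentially expressed in both conditions."""
    return set(degs_a) & set(degs_b)


def rank_hubs(edges, nodes, top_k):
    """Rank genes by degree within the PPI subnetwork restricted to `nodes`."""
    degree = {}
    seen = set()
    for u, v in edges:
        # Skip self-loops and interactions that leave the gene set of interest.
        if u == v or u not in nodes or v not in nodes:
            continue
        key = frozenset((u, v))
        if key in seen:  # count each undirected interaction only once
            continue
        seen.add(key)
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Highest degree first; ties are broken alphabetically for reproducibility.
    ranked = sorted(nodes, key=lambda g: (-degree.get(g, 0), g))
    return [(g, degree.get(g, 0)) for g in ranked[:top_k]]


# Hypothetical placeholder DEG lists and interactions (NOT from the study).
covid_degs = {"G1", "G2", "G3", "G4", "G5"}
ptsd_degs = {"G2", "G3", "G4", "G5", "G6"}
ppi_edges = [("G2", "G3"), ("G2", "G4"), ("G2", "G5"),
             ("G3", "G4"), ("G1", "G2"), ("G3", "G2")]

shared = common_degs(covid_degs, ptsd_degs)
hubs = rank_hubs(ppi_edges, shared, top_k=2)
print(sorted(shared), hubs)
```

Other centrality measures (betweenness, closeness, or maximal clique centrality) are often used alongside degree and can give a different hub list.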
{"title":"Adopting Integrated Bioinformatics and Systems Biology Approaches to Pinpoint the COVID-19 Patients' Risk Factors That Uplift the Onset of Posttraumatic Stress Disorder.","authors":"Sabbir Ahmed, Md Arju Hossain, Sadia Afrin Bristy, Md Shahjahan Ali, Md Habibur Rahman","doi":"10.1177/11779322241274958","DOIUrl":"https://doi.org/10.1177/11779322241274958","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights","volume":"18","pages":"11779322241274958"},"PeriodicalIF":2.3,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11402063/pdf/","platform":"Semanticscholar","paperid":"142280201","RegionNum":0,"PMCID":"OA","Total":0}
Pub Date : 2024-09-06 eCollection Date: 2024-01-01 DOI: 10.1177/11779322241272395
Hermenegildo Taboada-Castro, Alfredo José Hernández-Álvarez, Jaime A Castro-Mondragón, Sergio Encarnación-Guevara
RhizoBindingSites is a de novo depurified database of conserved DNA motifs potentially involved in the transcriptional regulation of the Rhizobium, Sinorhizobium, Bradyrhizobium, Azorhizobium, and Mesorhizobium genera covering 9 representative symbiotic species, deduced from the upstream regulatory sequences of orthologous genes (O-matrices) from the Rhizobiales taxon. The sites collected with O-matrices per gene per genome from RhizoBindingSites were used to deduce matrices using the dyad-Regulatory Sequence Analysis Tool (RSAT) method, giving rise to novel S-matrices for the construction of the RhizoBindingSites v2.0 database. A comparison of the S-matrix logos showed a greater frequency and/or re-definition of position-specific nucleotides found in the O-matrices. Moreover, S-matrices were better at detecting genes in the genome and found a greater number of transcription factors (TFs) in the vicinity than O-matrices, corresponding to greater genomic coverage for S-matrices. O-matrices of 3187 TFs and S-matrices of 2754 TFs from 9 species were deposited in RhizoBindingSites and RhizoBindingSites v2.0, respectively. The homology between the matrices of TFs from a genome showed inter-regulation between the clustered TFs. In addition, matrices of AraC, ArsR, GntR, and LysR ortholog TFs showed different motifs, suggesting distinct regulation. Benchmarking showed 72%, 68%, and 81% of common genes per regulon for O-matrices of Rhizobium etli CFN42, Rhizobium leguminosarum bv. viciae 3841, and Sinorhizobium meliloti 1021, respectively, and approximately 14% fewer common genes with S-matrices. These data were deposited in RhizoBindingSites and the RhizoBindingSites v2.0 database (http://rhizobindingsites.ccg.unam.mx/).
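To make the idea of deducing a matrix from collected sites more concrete, here is a minimal sketch of building a log-odds position weight matrix from aligned sites and scanning a sequence with it. It is a generic textbook construction, not the RSAT dyad-analysis method used for the S-matrices. The example sites, the upstream sequence, the pseudocount, and the uniform background are all assumptions chosen for illustration.

```python
from math import log2

ALPHABET = "ACGT"


def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix (one dict per column) from aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        # Pseudocounts keep unseen bases from producing log2(0).
        pwm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pwm)
    return [
        (start, sum(pwm[i][base] for i, base in enumerate(sequence[start:start + width])))
        for start in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, pwm):
    """Return the (start, score) pair of the highest-scoring window."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


# Hypothetical aligned sites and upstream sequence (uppercase A/C/G/T only).
sites = ["TTGACA", "TTGACA", "TTGATA", "TTGACT"]
pwm = position_weight_matrix(sites)
start, score = best_hit("GGCATTGACAGG", pwm)
print(start, round(score, 2))
```

In practice, a score threshold or P value cutoff decides which windows count as predicted sites, and both DNA strands are usually scanned.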
{"title":"RhizoBindingSites v2.0 Is a Bioinformatic Database of DNA Motifs Potentially Involved in Transcriptional Regulation Deduced From Their Genomic Sites.","authors":"Hermenegildo Taboada-Castro, Alfredo José Hernández-Álvarez, Jaime A Castro-Mondragón, Sergio Encarnación-Guevara","doi":"10.1177/11779322241272395","DOIUrl":"https://doi.org/10.1177/11779322241272395","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights","volume":"18","pages":"11779322241272395"},"PeriodicalIF":2.3,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11380129/pdf/","platform":"Semanticscholar","paperid":"142153089","RegionNum":0,"PMCID":"OA","Total":0}