Pub Date : 2024-05-28 DOI: 10.1186/s44342-024-00003-6
Ngoc Thi Bich Chu, Man Thi Le, Hong Viet La, Quynh Thi Ngoc Le, Thao Duc Le, Huyen Thi Thanh Tran, Lan Thi Mai Tran, Chi Toan Le, Dung Viet Nguyen, Phi Bang Cao, Ha Duc Chu
Small auxin-up RNA (SAUR) proteins form a large family thought to participate in various biological processes in higher plants. However, the SAUR family has not yet been explored in cacao (Theobroma cacao L.), one of the most important industrial trees. The present in silico study examined the structure, phylogeny, and expression of the TcSAUR gene family in cacao. A total of 90 members of the TcSAUR gene family were identified and annotated in the cacao genome. According to the physicochemical analysis, all TcSAUR proteins exhibited somewhat similar characteristics. Phylogenetic analysis showed that these TcSAUR proteins could be categorized into seven distinct groups, with 10 sub-groups. Our results suggest that tandem duplication, segmental duplication, and whole-genome duplication events might have been important in the expansion of the TcSAUR gene family in cacao. By re-analyzing available transcriptome datasets, we found that a number of TcSAUR genes were exclusively expressed during zygotic embryogenesis and somatic embryogenesis. Taken together, our study will be valuable for further functional characterization of candidate TcSAUR genes for the genetic engineering of cacao.
{"title":"Genome-wide identification, characterization, and expression analysis of the small auxin-up RNA gene family during zygotic and somatic embryo maturation of the cacao tree (Theobroma cacao).","authors":"Ngoc Thi Bich Chu, Man Thi Le, Hong Viet La, Quynh Thi Ngoc Le, Thao Duc Le, Huyen Thi Thanh Tran, Lan Thi Mai Tran, Chi Toan Le, Dung Viet Nguyen, Phi Bang Cao, Ha Duc Chu","doi":"10.1186/s44342-024-00003-6","journal":{"name":"Genomics & informatics","volume":"22 1","pages":"2"},"publicationDate":"2024-05-28","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11184954/pdf/"}
Pub Date : 2023-12-01 Epub Date: 2023-09-27 DOI: 10.5808/gi.23061
So-Hyun Bae, Taewon Hwang, Mi-Ryung Han
Tumor hypoxia, a state of oxygen deprivation, occurs in most cancers and promotes angiogenesis, enhancing the potential for metastasis. The vascular endothelial growth factor (VEGF) family genes play crucial roles in tumorigenesis by promoting angiogenesis. To investigate the malignant processes triggered by hypoxia-induced angiogenesis across cancer types, we comprehensively analyzed the relationships between the expression of VEGF family genes and the hypoxic microenvironment using integrated bioinformatics methods. Our results suggest that the expression of VEGF family genes differs significantly among cancers, highlighting their heterogeneous effects on human cancers. Across the 33 cancers, VEGFB and VEGFD showed the highest and lowest expression levels, respectively. The survival analysis showed that VEGFA and placental growth factor (PGF) were correlated with poor prognosis in many cancers, including kidney renal cell carcinoma and liver hepatocellular carcinoma. VEGFC expression was positively correlated with glioma and stomach cancer. VEGFA and PGF showed distinct positive correlations with hypoxia scores in most cancers, indicating a potential correlation with tumor aggressiveness. The expression of miRNAs targeting VEGF family genes, including hsa-miR-130b-5p and hsa-miR-940, was positively correlated with hypoxia. In the immune subtype analysis, VEGFC was highly expressed in C3 (inflammatory) and C6 (transforming growth factor β dominant) across various cancers, indicating its potential role as a tumor promoter. VEGFC expression exhibited positive correlations with immune infiltration scores, suggesting low tumor purity. High expression of VEGFA and VEGFC was associated with favorable responses to various drugs, including BLU-667, which abrogates RET signaling, an oncogenic driver in liver and thyroid cancers. Our findings suggest potential roles of VEGF family genes in malignant processes related to hypoxia-induced angiogenesis.
{"title":"Unraveling the hypoxia modulating potential of VEGF family genes in pan-cancer.","authors":"So-Hyun Bae, Taewon Hwang, Mi-Ryung Han","doi":"10.5808/gi.23061","journal":{"name":"Genomics & informatics","volume":" ","pages":"e44"},"publicationDate":"2023-12-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10788353/pdf/"}
Pub Date : 2023-09-01 Epub Date: 2023-09-27 DOI: 10.5808/gi.23039
Sara Hajipour, Sayed Mostafa Hosseini, Shiva Irani, Mahmood Tavallaie
Non-small cell lung cancer (NSCLC) is an important cause of cancer-associated deaths worldwide. However, the exact molecular mechanisms of NSCLC remain unidentified. The present investigation aims to identify miRNAs with predictive value in NSCLC. Two datasets were downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed miRNAs (DEmiRNAs) and mRNAs (DEmRNAs) were selected from the normalized data. Next, miRNA-mRNA interactions were determined. Then, co-expression network analysis was performed using the WGCNA package in R. The co-expression network between DEmiRNAs and DEmRNAs was calculated to prioritize the miRNAs. Next, enrichment analysis was performed for the DEmiRNAs and DEmRNAs. Finally, a drug-gene interaction network was constructed by importing the gene list into the DGIdb database. A total of 3,033 differentially expressed genes and 58 DEmiRNAs were identified from the two datasets. Co-expression network analysis was used to build a gene co-expression network, and four modules were selected based on the Zsummary score. In the next step, a bipartite miRNA-gene network was constructed and hub miRNAs (let-7a-2-3p, let-7d-5p, let-7b-5p, let-7a-5p, and let-7b-3p) were selected. Finally, a drug-gene network was constructed, and sunitinib, medroxyprogesterone acetate, dofetilide, haloperidol, and calcitriol were identified as potentially beneficial drugs in NSCLC. The hub miRNAs and repurposed drugs may play vital roles in NSCLC progression and treatment, respectively; however, these results must be validated in further clinical and experimental assessments.
{"title":"Identification of novel potential drugs and miRNAs biomarkers in lung cancer based on gene co-expression network analysis.","authors":"Sara Hajipour, Sayed Mostafa Hosseini, Shiva Irani, Mahmood Tavallaie","doi":"10.5808/gi.23039","journal":{"name":"Genomics & informatics","volume":"21 3","pages":"e38"},"publicationDate":"2023-09-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584645/pdf/"}
Pub Date : 2023-09-01 Epub Date: 2023-09-27 DOI: 10.5808/gi.23002
Ratih Dewi Yudhani, Dyonisa Nasirochmi Pakha, Suyatmi Suyatmi, Lalu Muhammad Irham
Systemic lupus erythematosus (SLE) is an inflammatory-autoimmune disease with a complex multi-organ pathogenesis, and it is known to be associated with significant morbidity and mortality. Various genetic, immunological, endocrine, and environmental factors contribute to SLE. Genomic variants have been identified as potential contributors to SLE susceptibility across multiple continents. However, the specific pathogenic variants that drive SLE remain largely undefined. In this study, we sought to identify these pathogenic variants across various continents using genomic and bioinformatic-based methodologies. We found that the variants rs35677470, rs34536443, rs17849502, and rs13306575 are likely damaging in SLE. Furthermore, these four variants appear to affect the gene expression of NCF2, TYK2, and DNASE1L3 in whole blood tissue. Our findings suggest that these genomic variants warrant further research for validation in functional studies and clinical trials involving SLE patients. We conclude that the integration of genomic and bioinformatic-based databases could enhance our understanding of disease susceptibility, including that of SLE.
{"title":"Identifying pathogenic variants related to systemic lupus erythematosus by integrating genomic databases and a bioinformatic approach.","authors":"Ratih Dewi Yudhani, Dyonisa Nasirochmi Pakha, Suyatmi Suyatmi, Lalu Muhammad Irham","doi":"10.5808/gi.23002","journal":{"name":"Genomics & informatics","volume":"21 3","pages":"e37"},"publicationDate":"2023-09-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584638/pdf/"}
DNA barcoding without assessing reliability and validity causes taxonomic errors in species identification, which disrupt conservation efforts and the aquaculture industry. Although DNA barcoding facilitates molecular identification and phylogenetic analysis of species, its applicability to the clariid catfish lineage remains uncertain. In this study, DNA barcoding was developed and validated for clariid catfish. A total of 2,970 barcode sequences from the mitochondrial cytochrome c oxidase I (COI) and cytochrome b (Cytb) genes and the D-loop region were analyzed for 37 clariid catfish species. The highest intraspecific nearest neighbor distances were 85.47%, 98.03%, and 89.10% for COI, Cytb, and D-loop sequences, respectively. This suggests that the Cytb gene is the most appropriate for identifying clariid catfish and can serve as a standard region for DNA barcoding. A positive barcoding gap between interspecific and intraspecific sequence divergence was observed in the Cytb dataset but not in the COI and D-loop datasets. Intraspecific variation was typically less than 4.4%, whereas interspecific variation was generally more than 66.9%. However, a species complex was detected in walking catfish and significant intraspecific sequence divergence was observed in North African catfish. These findings suggest the need to focus on developing a DNA barcoding system that classifies clariid catfish properly and to validate its efficacy for a wider range of clariid catfish. With an enriched database of multiple sequences from a target species and its genus, species identification can be more accurate and biodiversity assessment of the species can be facilitated.
{"title":"Overcoming taxonomic challenges in DNA barcoding for improvement of identification and preservation of clariid catfish species.","authors":"Piangjai Chalermwong, Thitipong Panthum, Pish Wattanadilokcahtkun, Nattakan Ariyaraphong, Thanyapat Thong, Phanitada Srikampa, Worapong Singchat, Syed Farhan Ahmad, Kantika Noito, Ryan Rasoarahona, Artem Lisachov, Hina Ali, Ekaphan Kraichak, Narongrit Muangmai, Satid Chatchaiphan, Kednapat Sriphairoj, Sittichai Hatachote, Aingorn Chaiyes, Chatchawan Jantasuriyarat, Visarut Chailertlit, Warong Suksavate, Jumaporn Sonongbua, Witsanu Srimai, Sunchai Payungporn, Kyudong Han, Agostinho Antunes, Prapansak Srisapoome, Akihiko Koga, Prateep Duengkae, Yoichi Matsuda, Uthairat Na-Nakorn, Kornsorn Srikulnath","doi":"10.5808/gi.23038","journal":{"name":"Genomics & informatics","volume":"21 3","pages":"e39"},"publicationDate":"2023-09-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584641/pdf/"}
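The barcoding gap mentioned in the clariid catfish abstract can be illustrated with a minimal sketch. It assumes uncorrected p-distances on pre-aligned sequences, which the abstract does not specify; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gap(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns (max intraspecific distance, min interspecific distance, gap).
    A positive gap means every between-species distance exceeds every
    within-species distance in this dataset.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need at least one within- and one between-species pair")
    max_intra, min_inter = max(intra), min(inter)
    return max_intra, min_inter, min_inter - max_intra


# Made-up toy sequences, for illustration only.
toy = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "TCGAACGGAC"),
    ("species_B", "TCGAACGGAT"),
]
max_intra, min_inter, gap = barcoding_gap(toy)
print(f"max intra={max_intra:.2f}, min inter={min_inter:.2f}, gap={gap:.2f}")
```

A marker with a negative gap under this definition has at least one within-species pair that is more divergent than some between-species pair, which is the situation the abstract reports for the COI and D-loop datasets.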
Resistance to anti-tuberculosis drugs, especially ethambutol (EMB), has been reported worldwide. EMB resistance is caused by mutations in the embB gene, which encodes the arabinosyl transferase enzyme. This study aimed to detect mutations in the embB gene of Mycobacterium tuberculosis from Papua and to evaluate their impact on the effectiveness of EMB. We analyzed 20 M. tuberculosis culture samples that had undergone whole-genome sequencing, of which 19 were of sufficient quality for further bioinformatics analysis. Mutation analysis was performed using TBProfiler, which identified M306L, M306V, D1024N, and E378A mutations. In sample TB035, the M306L mutation was present along with E378A. The binding affinity of EMB to arabinosyl transferase was calculated using AutoDock Vina. The molecular docking results revealed that all mutants demonstrated an increased binding affinity to EMB compared to the native protein (-0.948 kcal/mol). The presence of the M306L mutation, when coexisting with E378A, resulted in a slight increase in binding affinity compared to the M306L mutation alone. The molecular dynamics simulation results indicated that the M306L, M306L + E378A, M306V, and E378A mutants decreased protein stability. Conversely, the D1024N mutant exhibited stability comparable to the native protein. In conclusion, this study suggests that the M306L, M306L + E378A, M306V, and E378A mutations may contribute to EMB resistance, while the D1024N mutation may be consistent with continued susceptibility to EMB.
{"title":"Structural dynamics insights into the M306L, M306V, and D1024N mutations in Mycobacterium tuberculosis inducing resistance to ethambutol.","authors":"Yustinus Maladan, Dodi Safari, Arli Aditya Parikesit","doi":"10.5808/gi.23019","journal":{"name":"Genomics & informatics","volume":"21 3","pages":"e32"},"publicationDate":"2023-09-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584647/pdf/"}
Pub Date : 2023-09-01 Epub Date: 2023-06-28 DOI: 10.5808/gi.23047
Eunjee Lee, Joseph G Ibrahim, Hongtu Zhu
Mild cognitive impairment (MCI) is a clinical syndrome characterized by the onset and evolution of cognitive impairments, often considered a transitional stage to Alzheimer's disease (AD). Identifying the genetic traits of MCI patients who experience a rapid progression to AD can enhance early diagnosis capabilities and facilitate drug discovery for AD. While a genome-wide association study (GWAS) is a standard tool for identifying single nucleotide polymorphisms (SNPs) related to a disease, it fails to detect SNPs with small effect sizes due to stringent control for multiple testing. Additionally, the method does not consider the group structures of SNPs, such as genes or linkage disequilibrium blocks, which can provide valuable insights into the genetic architecture. To address these limitations, we propose a Bayesian bi-level variable selection method that detects SNPs associated with the time to conversion from MCI to AD. Our approach integrates group inclusion indicators into an accelerated failure time model to identify important SNP groups. Additionally, we employ data augmentation techniques to impute censored time values using the posterior predictive distribution. We adapt Dirichlet-Laplace shrinkage priors to incorporate the group structure for SNP-level variable selection. In the simulation study, our method outperformed competing methods in variable selection. The analysis of Alzheimer's Disease Neuroimaging Initiative (ADNI) data revealed several genes directly or indirectly related to AD, whereas a classical GWAS did not identify any significant SNPs.
{"title":"Bayesian bi-level variable selection for genome-wide survival study.","authors":"Eunjee Lee, Joseph G Ibrahim, Hongtu Zhu","doi":"10.5808/gi.23047","journal":{"name":"Genomics & informatics","volume":"21 3","pages":"e28"},"publicationDate":"2023-09-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584651/pdf/"}
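For readers unfamiliar with the model class named in the Bayesian bi-level selection abstract, a generic accelerated failure time model with group inclusion indicators can be written as below. This is only a sketch of the general form: all symbols ($T_i$, $C_i$, $\gamma_g$, $\boldsymbol{\beta}_g$, $\sigma$, $\varepsilon_i$) are assumed notation, and the paper's actual parameterization and priors may differ.

```latex
% Assumed notation: T_i is the conversion time of subject i, x_{ig} the SNPs of
% subject i in group g (e.g. a gene or LD block), and gamma_g switches group g on or off.
\log T_i = \mu + \sum_{g=1}^{G} \gamma_g \, \mathbf{x}_{ig}^{\top} \boldsymbol{\beta}_g + \sigma \varepsilon_i,
\qquad \gamma_g \in \{0, 1\}

% Right censoring: only the earlier of the event time T_i and censoring time C_i is observed.
y_i = \min(T_i, C_i), \qquad \delta_i = \mathbb{1}(T_i \le C_i)
```

In this form, group-level selection acts through $\gamma_g$, SNP-level selection would act through shrinkage of the entries of $\boldsymbol{\beta}_g$, and data augmentation replaces each censored $y_i$ (where $\delta_i = 0$) with a draw of $T_i$ constrained to exceed $C_i$.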
Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis, one of the deadliest infections in humans. The emergence of multidrug-resistant and extensively drug-resistant Mtb strains presents a global challenge. Mtb has shown resistance to many frontline antibiotics, including rifampicin, kanamycin, isoniazid, and capreomycin. The only licensed vaccine, Bacille Calmette-Guerin, does not efficiently protect against adult pulmonary tuberculosis. Therefore, it is urgently necessary to develop new vaccines to prevent infections caused by these strains. We used a subtractive proteomics approach on 23 virulent Mtb strains and identified a conserved membrane protein (MmpL4, NP_214964.1) as both a potential drug target and vaccine candidate. MmpL4 is an essential protein that is non-homologous to host proteins and is involved in a pathogen-specific pathway. Furthermore, MmpL4 shows no homology with anti-targets and has limited homology to human gut microflora, potentially reducing the likelihood of adverse effects and cross-reactivity if therapeutics specific to this protein are developed. Subsequently, we constructed a highly soluble, safe, antigenic, and stable multi-subunit vaccine from the MmpL4 protein using immunoinformatics. Molecular dynamics simulations revealed the stability of the vaccine-bound Toll-like receptor-4 complex on a nanosecond scale, and immune simulations indicated strong primary and secondary immune responses in the host. Therefore, our study identifies a new target that could expedite the design of effective therapeutics, and the designed vaccine should be validated. Future directions include an extensive molecular interaction analysis, in silico cloning, wet-lab experiments, and evaluation and comparison of the designed candidate as both a DNA vaccine and protein vaccine.
{"title":"Multi-epitope vaccine against drug-resistant strains of Mycobacterium tuberculosis: a proteome-wide subtraction and immunoinformatics approach.","authors":"Md Tahsin Khan, Araf Mahmud, Md Muzahidul Islam, Mst Sayedatun Nessa Sumaia, Zeaur Rahim, Kamrul Islam, Asif Iqbal","doi":"10.5808/gi.23021","journal":{"name":"Genomics & informatics","volume":"21 3","pages":"e42"},"publicationDate":"2023-09-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584640/pdf/"}
Pub Date : 2023-09-01 Epub Date: 2023-09-27 DOI: 10.5808/gi.23024
Ayesha Wisal, Asad Ullah, Waheed Anwar, Carlos M Morel, Syed Shah Hassan
Nosocomial infections, commonly referred to as healthcare-associated infections, are illnesses that patients acquire while hospitalized and that typically are either not yet manifest or may still develop. One of the most prevalent nosocomial diseases in hospitalized patients is pneumonia, which is among the leading causes of mortality and morbidity. Pneumonia is caused by viral, bacterial, and fungal pathogens. More severe cases commonly involve Staphylococcus aureus, which ranks at the top of bacterial infections according to World Health Organization reports. S. aureus strain RMI-014804, a mesophilic, non-sporulating, and non-motile bacterium, was isolated from the sputum of a pulmonary patient in Pakistan. This paper reports many characteristics of S. aureus strain RMI-014804, together with its complete genome sequence and annotation. Our findings indicate that the genome is single and circular, 2.82 Mbp long, with 1,962 protein-coding genes, 15 rRNA, 49 tRNA, 62 pseudogenes, and a GC content of 28.76%. This genome sequencing analysis should help researchers understand the genetic and molecular basis of the virulence of S. aureus, which could help prevent the spread of nosocomial infections such as pneumonia. Genome analysis of this strain was necessary to identify the specific genes and molecular mechanisms that contribute to its pathogenicity, antibiotic resistance, and genetic diversity, allowing a more in-depth investigation of its pathogenesis in order to develop new treatments and preventive measures against infections caused by this bacterium.
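GC content, one of the genome statistics reported above, is the percentage of guanine and cytosine among the unambiguous bases of a sequence. A minimal sketch of the calculation follows; the function name `gc_content` and the toy sequences are illustrative, and the abstract does not say which tool the authors used to obtain their figure.

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored in both numerator and denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total


# Toy sequences, not taken from the RMI-014804 genome.
print(gc_content("ATGCATGC"))
print(gc_content("aattnnGC"))
```

For a full assembly, the same ratio would be accumulated over every contig of the FASTA file rather than over a single string.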
{"title":"Whole genomic sequencing of Staphylococcus aureus strain RMI-014804 isolated from pulmonary patient sputum via next-generation sequencing technology.","authors":"Ayesha Wisal, Asad Ullah, Waheed Anwar, Carlos M Morel, Syed Shah Hassan","doi":"10.5808/gi.23024","DOIUrl":"10.5808/gi.23024","url":null,"abstract":"<p><p>Nosocomial infections, commonly referred to as healthcare-associated infections, are illnesses that patients get while hospitalized and are typically either not yet manifest or may develop. One of the most prevalent nosocomial diseases in hospitalized patients is pneumonia, among the leading causes of mortality and morbidity. Viral, bacterial, and fungal pathogens cause pneumonia. More severe introductions commonly included Staphylococcus aureus, which is at the top of bacterial infections, per World Health Organization reports. The staphylococci, S. aureus, strain RMI-014804, mesophile, on-sporulating, and non-motile bacterium, was isolated from the sputum of a pulmonary patient in Pakistan. Many characteristics of S. aureus strain RMI-014804 have been revealed in this paper, with complete genome sequence and annotation. Our findings indicate that the genome is a single circular 2.82 Mbp long genome with 1,962 protein-coding genes, 15 rRNA, 49 tRNA, 62 pseudogenes, and a GC content of 28.76%. As a result of this genome sequencing analysis, researchers will fully understand the genetic and molecular basis of the virulence of the S. aureus bacteria, which could help prevent the spread of nosocomial infections like pneumonia. 
Genome analysis of this strain was necessary to identify the specific genes and molecular mechanisms that contribute to its pathogenicity, antibiotic resistance, and genetic diversity, allowing for a more in-depth investigation of its pathogenesis to develop new treatments and preventive measures against infections caused by this bacterium.</p>","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":"21 3","pages":"e34"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584650/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41184739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Type 2 diabetes mellitus (T2DM) is a multifactorial, polygenic, and metabolically complex disease. A large number of genes contribute to the development of T2DM, and calpain-10 (CAPN10) is one of them. The association of numerous CAPN10 genetic polymorphisms with the development of T2DM has been widely studied in different populations, with inconclusive results. The present study is an attempt to evaluate the plausible association of CAPN10 polymorphism SNP-19 (rs3842570) with T2DM and T2DM-related anthropometric and metabolic traits in the Noakhali region of Bangladesh. This case-control study included 202 T2DM patients and 75 healthy individuals from different places in Noakhali. A significant association (p < 0.05) of SNP-19 with T2DM was observed in the co-dominant model, 2R/3R vs. 3R/3R (odds ratio [OR], 2.7; p = 0.0014), and in the dominant model, (2R/3R) + (2R/2R) vs. 3R/3R (OR, 2.47; p = 0.0011). The high-risk allele 2R also showed a significant association with T2DM in the allelic model (OR, 1.67; p = 0.0109). The genotypic frequencies of SNP-19 variants were consistent with Hardy-Weinberg equilibrium (p > 0.05). Additionally, SNP-19 genetic variants showed potential associations with the anthropometric and metabolic traits of T2DM patients in terms of body mass index, systolic blood pressure, diastolic blood pressure, total cholesterol, and triglycerides. Our study identifies the 2R/3R genotype of SNP-19 as a significant risk factor for the development of T2DM in the Noakhali population. Furthermore, a large-scale study could be instrumental in corroborating this finding in the overall Bangladeshi population.
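The two statistics quoted above, the odds ratio under a genetic model and the Hardy-Weinberg equilibrium check, can both be computed from genotype counts. The sketch below assumes a Woolf (log-based) 95% confidence interval and a one-degree-of-freedom chi-square test; the abstract does not state which methods or software the authors used, and the genotype counts are invented for illustration, not the study's data.

```python
import math
from typing import Tuple


def odds_ratio(a: int, b: int, c: int, d: int) -> Tuple[float, float, float]:
    """Odds ratio with a Woolf 95% CI for a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls (all must be > 0).
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic (1 df) for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele a
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts (2R/2R, 2R/3R, 3R/3R); NOT the study's data.
cases = (20, 50, 30)
controls = (10, 30, 60)

# Dominant model: carriers of 2R (2R/2R + 2R/3R) vs. 3R/3R.
or_dom, ci_low, ci_high = odds_ratio(
    cases[0] + cases[1], cases[2], controls[0] + controls[1], controls[2]
)
print(f"dominant-model OR = {or_dom:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")

# A statistic below 3.841 (chi-square, 1 df, alpha = 0.05) is consistent with HWE.
print(f"HWE chi-square in controls = {hwe_chi_square(*controls):.3f}")
```

The co-dominant and allelic models reuse `odds_ratio` with different groupings: heterozygotes vs. 3R/3R homozygotes for the former, and allele counts (two per individual) for the latter.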
{"title":"Association of CAPN10 gene (rs3842570) polymorphism with the type 2 diabetes mellitus among the population of Noakhali region in Bangladesh: a case-control study.","authors":"Munia Sultana, Md Mafizul Islam, Md Murad Hossain, Md Anisur Rahman, Shuvo Chandra Das, Dhirendra Nath Barman, Farhana Siddiqi Mitu, Shipan Das Gupta","doi":"10.5808/gi.23023","DOIUrl":"10.5808/gi.23023","url":null,"abstract":"<p><p>Type 2 diabetes mellitus (T2DM) is a multifactorial, polygenic, and metabolically complicated disease. A large number of genes are responsible for the biogenesis of T2DM and calpain10 (CAPN10) is one of them. The association of numerous CAPN10 genetic polymorphisms in the development of T2DM has been widely studied in different populations and noticed inconclusive results. The present study is an attempt to evaluate the plausible association of CAPN10 polymorphism SNP-19 (rs3842570) with T2DM and T2DM-related anthropometric and metabolic traits in the Noakhali region of Bangladesh. This case-control study included 202 T2DM patients and 75 healthy individuals from different places in Noakhali. A significant association (p < 0.05) of SNP-19 with T2DM in co-dominant 2R/3R vs. 3R/3R (odds ratio [OR], 2.7; p=0.0014) and dominant (2R/3R) + (2R/2R) vs. 3R/3R (OR, 2.47; p=0.0011) genetic models was observed. High-risk allele 2R also showed a significant association with T2DM in the allelic model (OR, 1.67; p=0.0109). The genotypic frequency of SNP-19 variants showed consistency with Hardy-Weinberg equilibrium (p > 0.05). Additionally, SNP-19 genetic variants showed potential associations with the anthropometric and metabolic traits of T2DM patients in terms of body mass index, systolic blood pressure, diastolic blood pressure, total cholesterol, and triglycerides. Our approach identifies the 2R/3R genotype of SNP-19 as a significant risk factor for biogenesis of T2DM in the Noakhali population. 
Furthermore, a large-scale study could be instrumental to correlate this finding in overall Bangladeshi population.</p>","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":"21 3","pages":"e33"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10584643/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41184726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}