Pub Date: 2025-01-17 DOI: 10.1186/s12863-025-01300-x
Min-Kyu Park, Yeong-Jun Park, Myung-Suk Kang, Min-Ha Kim, Soo-Young Kim, Jae-Ho Shin
Objectives: The data were collected to obtain the complete genome sequence of Pseudarthrobacter sp. NIBRBAC000502770, isolated from the rhizosphere of Sasamorpha in a heavy metal-contaminated coal mine in Hongcheon, Republic of Korea. The objective was to explore the strain's genetic potential for plant growth promotion and heavy metal resistance, particularly to arsenate and copper. The aim was to identify microbes that enhance plant growth in metal-contaminated environments and to evaluate the strain's potential uses in bioremediation and agriculture. The study sought key genes for these applications in contaminated soils, in support of sustainable management and biotechnology.
Data description: We report the complete genome sequence of Pseudarthrobacter sp. NIBRBAC000502770, isolated from a coal mine in Hongcheon, Republic of Korea. The genome contains a chromosome (4,403,796 bp) and a plasmid (74,326 bp, named pMK-1) with 286-fold coverage. Genome annotation identified 4,209 genes, including 3,926 protein-coding genes, 51 tRNA genes, and 15 rRNA genes, with a G + C content of 66.1%. Functional analysis revealed genes related to plant growth promotion and heavy metal resistance, such as arsenate (arsR, arsC) and copper (copC, copD) resistance genes. Genes involved in auxin biosynthesis suggest potential agricultural applications. The genome and plasmid are available in GenBank (CP041198.1, CP014497.1), offering insights into bioremediation and plant growth in metal-contaminated environments.
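As a quick consistency check on the figures in the data description, the following minimal sketch tallies the replicon sizes and the annotated gene counts. The variable names are illustrative, and what the remaining genes are (for example pseudogenes or other non-coding RNAs) is an assumption; the abstract does not say.

```python
# Figures taken from the data description of the Pseudarthrobacter genome.
chromosome_bp = 4_403_796
plasmid_bp = 74_326          # plasmid pMK-1
total_genes = 4_209
protein_coding = 3_926
trna = 51
rrna = 15

# Total assembly size and the plasmid's share of it.
genome_bp = chromosome_bp + plasmid_bp
plasmid_share = plasmid_bp / genome_bp

# Genes not covered by the three categories listed in the abstract.
# Their nature (pseudogenes, other ncRNAs, ...) is an assumption.
unlisted_genes = total_genes - (protein_coding + trna + rrna)

print(f"Total assembly: {genome_bp:,} bp")
print(f"Plasmid share of assembly: {plasmid_share:.2%}")
print(f"Genes outside the listed categories: {unlisted_genes}")
```

The three listed categories do not add up to the reported gene total, which is normal for annotation summaries that report only the main feature classes.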
Complete genome sequence of Pseudarthrobacter sp. NIBRBAC000502770 from coal mine of Hongcheon on Republic of Korea. BMC genomic data 26(1):5. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11740416/pdf/
Background: miRNAs (microRNAs) are endogenous RNAs 18 to 24 nucleotides in length that play critical roles in gene regulation and disease progression. Although traditional wet-lab experiments provide direct evidence for miRNA-disease associations, they are often time-consuming, and their results are complicated to analyze with current bioinformatics tools. In recent years, machine learning (ML) and deep learning (DL) techniques have become powerful tools for analyzing large-scale biological data. Hence, developing a model to predict, identify, and rank connections between miRNAs and diseases can significantly enhance the precision and efficiency of investigating their relationships.
Results: In this study, we used experimentally derived miRNA-disease association data to develop a DL model for miRNA-disease associations. The datasets were taken from HMDD (the Human miRNA Disease Database). To improve prediction accuracy, we introduced two labeling strategies, weight-based and majority-based definitions, to classify the associations. After preprocessing, a novel model combining gated recurrent units (GRU) and a graph convolutional network (GCN) was trained on the data to predict the level of miRNA-disease associations. We classified the associations into three groups, "upregulated", "downregulated" and "nonspecific", by regression analysis and multiclass classification. The GRU-GCN coordinated model achieved an area under the curve (AUC) score of 0.8 in all datasets, demonstrating its efficacy in predicting potential miRNA-disease relationships.
Conclusions: By introducing innovative label-preprocessing methods, this study addressed the relationships between miRNAs and diseases and reduced the ambiguity of results across different experiments. Based on these refined label definitions, we developed a DL-based model to refine and predict miRNA-disease associations. This model offers a valuable tool for complementing traditional experimental methods and enhancing our understanding of miRNA-related disease mechanisms.
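The abstract names two labeling strategies but does not define them. The sketch below is therefore purely hypothetical: it shows one way a majority-based and a weight-based rule could collapse several conflicting experimental records for one miRNA-disease pair into a single label. The function names, the +1/-1 scoring, and the 0.5 threshold are all assumptions for illustration, not the authors' method.

```python
from collections import Counter

LABELS = ("upregulated", "downregulated", "nonspecific")

def majority_label(directions):
    """Hypothetical majority rule: the most frequent direction wins;
    a tie for first place falls back to 'nonspecific'."""
    if not directions:
        raise ValueError("at least one record is required")
    counts = Counter(directions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "nonspecific"
    return counts[0][0]

def weighted_label(directions, threshold=0.5):
    """Hypothetical weight rule: score up as +1, down as -1, nonspecific
    as 0, then threshold the mean score (threshold is an assumption)."""
    if not directions:
        raise ValueError("at least one record is required")
    score = {"upregulated": 1, "downregulated": -1, "nonspecific": 0}
    mean = sum(score[d] for d in directions) / len(directions)
    if mean >= threshold:
        return "upregulated"
    if mean <= -threshold:
        return "downregulated"
    return "nonspecific"

# Three invented records for one miRNA-disease pair.
records = ["upregulated", "upregulated", "downregulated"]
print(majority_label(records))
print(weighted_label(records))
```

With these invented records the two rules disagree, which illustrates why the choice of label definition can change the training targets.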
Kai-Cheng Chuang, Ping-Sung Cheng, Yu-Hung Tsai, Meng-Hsiun Tsai. Establishing a GRU-GCN coordination-based prediction model for miRNA-disease associations. BMC genomic data 26(1):4. Published 2025-01-14. DOI: 10.1186/s12863-024-01293-z. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11734345/pdf/
Pub Date: 2025-01-13 DOI: 10.1186/s12863-024-01294-y
Jiao Wang, Lei Sun, Bo Jiao, Pu Zhao, Tianyun Xu, Sa Gu, Chenmin Huo, Jianzhou Pang, Shuo Zhou
Background: Wheat seeds display different colors depending on the types and contents of anthocyanins, which are closely related to anthocyanin metabolism. In this study, a transcriptomic and metabolomic comparison of white and purple wheat pericarp was used to explore key genes and metabolites involved in anthocyanin metabolism.
Results: Two wheat cultivars, the white-seed cultivar Shiluan02-1 and the purple-seed cultivar Hengzi151, were used to identify differentially expressed genes (DEGs) and differentially accumulated flavonoids (DAFs). Based on metabolomic data, 314 metabolites and 191 DAFs were identified. Chalcone, flavonol, pro-anthocyanidin and anthocyanidin were the most differentially accumulated flavonoid compounds in Hengzi151. According to transcriptomic data, 2610 up-regulated and 2668 down-regulated DEGs were identified. Some structural genes in the anthocyanin synthesis pathway, including PAL, CAD, and CHS, were prominently activated in Hengzi151. MYB, bHLH, WD40 and other transcription factors (TFs) probably involved in regulating anthocyanin biosynthesis were identified. Some genes from hormone synthesis and signaling pathways that may participate in regulating anthocyanin biosynthesis were also identified.
Conclusions: Our results provide valuable information on the candidate genes and metabolites involved in the anthocyanin metabolism in wheat pericarp.
Integrated metabolomic and transcriptomic analysis of anthocyanin metabolism in wheat pericarp. BMC genomic data 26(1):3. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11727400/pdf/
Pub Date: 2025-01-10 DOI: 10.1186/s12863-024-01292-0
Shufen Mo, Haiming Zhong, Weiping Dai, Yuanyuan Li, Bin Qi, Taidong Li, Yongguang Cai
Background: HER2-positive breast cancer (BC) is a subtype of breast cancer. Increased ERBB3 expression has been implicated as a potential cause of resistance to other HER-targeted therapies. Our study aimed to use bioinformatics to screen and validate prognostic markers that are associated with ERBB3 expression and affect the prognosis of HER2-positive BC.
Methods: Differences between ERBB3-related groups were analyzed. ERBB3 expression-related differentially expressed genes (DEGs) were identified and intersected with survival status-related DEGs to obtain intersected genes. Three algorithms, LASSO, RandomForest and XGBoost, were combined to identify the signature genes. We constructed risk models and generated ROC curves for prediction. Furthermore, we examined the immunological traits, correlations, and expression patterns of the signature genes through immune infiltration analysis, correlation analysis, and differential expression analysis.
Results: ERBB3 expression and prognosis differed significantly between the high and low ERBB3 expression groups. Twenty-five candidate DEGs were identified by intersecting ERBB3-related DEGs with survival-related DEGs. Using three distinct machine learning algorithms, we identified three signature genes (PBX1, IGHM, and CXCL13) that exhibited significant diagnostic value within the diagnostic model. In addition, the risk model had better prognostic and predictive effects, and the immune infiltration analysis showed that IGHM and CXCL13 might affect the proliferation of BC cells through immune cells. Functional studies demonstrated that interference with PBX1 inhibited the proliferation, migration, and epithelial-mesenchymal transition process of HER2-positive BC cells.
Conclusion: PBX1, IGHM and CXCL13 are associated with the expression level of ERBB3 and are prognostic markers for HER2-positive BC, and they may play an important role in the development and progression of BC.
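The abstract says that three algorithms were "combined" to identify the signature genes without stating how. One common reading, assumed here, is to keep only the genes selected by every algorithm. The sketch below illustrates that consensus step with plain sets. The per-algorithm lists and the GENE_A to GENE_D placeholders are invented for illustration; only the final three genes come from the abstract.

```python
def consensus_genes(*selections):
    """Return the genes present in every selection, sorted by name."""
    if not selections:
        return []
    gene_sets = [set(s) for s in selections]
    return sorted(set.intersection(*gene_sets))

# Hypothetical outputs of the three feature selectors.
lasso_genes = ["PBX1", "IGHM", "CXCL13", "GENE_A"]
random_forest_genes = ["CXCL13", "PBX1", "GENE_B", "IGHM", "GENE_C"]
xgboost_genes = ["IGHM", "GENE_A", "PBX1", "CXCL13", "GENE_D"]

signature = consensus_genes(lasso_genes, random_forest_genes, xgboost_genes)
print(signature)
```

A strict intersection is conservative: a gene dropped by any one selector is excluded, which trades sensitivity for robustness. A voting rule (for example, selected by at least two of three) would be a looser alternative.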
ERBB3-related gene PBX1 is associated with prognosis in patients with HER2-positive breast cancer. BMC genomic data 26(1):2. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11720925/pdf/
Pub Date: 2025-01-03 DOI: 10.1186/s12863-024-01288-w
Lei Wang, Chang Lu, Zhi-Gang Bao, Meng Li, Fusheng Wu, Yi-Zeng Lu, Bo-Qiang Tong, Mei Yu, Yong-Jun Zhao
Objectives: Toona sinensis, commonly known as Chinese toon, is a perennial woody plant with significant economic and ecological importance. This study employed whole-genome resequencing of 180 T. sinensis samples collected from Shandong to analyze genetic variation and diversity, ultimately identifying 18,231 high-quality SNPs after rigorous quality control and linkage disequilibrium pruning. This comprehensive genomic resource provides novel insights into the genetic architecture of T. sinensis, facilitating the elucidation of population structure and supporting future breeding programs.
Data description: We performed whole-genome resequencing on 180 Toona sinensis samples, generating 1170.26 Gbp of clean data with a Q30 percentage of 93.69%. The average alignment rate to the reference genome was 96.72%, with an average coverage depth of 8 × and a genome coverage of 88.71%. Following data quality control and alignment, we performed SNP calling and filtering to identify high-quality SNPs across all samples. Population structure analyses were then conducted using the identified SNPs, including principal component analysis (PCA), structure analysis, and phylogenetic tree construction. These comprehensive analyses provide a foundation for understanding the genetic diversity and evolutionary dynamics of T. sinensis.
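To put the sequencing totals in per-sample terms, the following minimal sketch derives average yields from the reported figures. It assumes the clean data were spread evenly across the 180 samples and treats the Q30 percentage and the read alignment rate as fractions of bases, which is only an approximation; the variable names are illustrative.

```python
# Figures taken from the Toona sinensis data description.
total_clean_gbp = 1170.26
n_samples = 180
q30_fraction = 0.9369
mean_alignment_rate = 0.9672

# Average clean yield per sample (assumes an even split across samples).
per_sample_gbp = total_clean_gbp / n_samples

# Approximate volume of bases at or above Q30 across the whole dataset.
q30_gbp = total_clean_gbp * q30_fraction

# Approximate aligned yield per sample (read-level rate applied to bases).
aligned_per_sample_gbp = per_sample_gbp * mean_alignment_rate

print(f"Clean data per sample: {per_sample_gbp:.2f} Gbp")
print(f"Bases at Q30 or better: {q30_gbp:.1f} Gbp")
print(f"Aligned data per sample: {aligned_per_sample_gbp:.2f} Gbp")
```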
Population structure and genetic diversity of Toona sinensis revealed by whole-genome resequencing. BMC genomic data 26(1):1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11697678/pdf/
Pub Date: 2024-12-26 DOI: 10.1186/s12863-024-01291-1
Van Hong Thi Dao, Loan To Nguyen, Khanh Phuong Do, Vinh The Nguyen, Hieu Van Nguyen, Khanh Ngoc Pham, Truong Xuan Nguyen, Son Truong Dinh
Objectives: This study aims to generate a de novo complete whole-genome assembly of Pseudomonas sp. strain HOU2, an endophytic bacterium isolated from dangshen roots that has been shown to improve the growth of in vitro dangshen plants. Further investigation of the whole genome of Pseudomonas sp. strain HOU2 will help identify potential genes or pathways that could be involved in its plant growth-promoting effects on in vitro dangshen plants, providing valuable information for future applications.
Data description: The genomic DNA of Pseudomonas sp. strain HOU2 was sequenced using Oxford Nanopore's PromethION sequencer with an R10.4.1 flow cell (Table 1, Data file 1). The genome was assembled using Flye version 2.9, resulting in a single circular chromosome of 6,047,544 bp with a mean coverage of 488× (Table 1, Data file 2). The annotation of genes, proteins, and features of the HOU2 genome was performed with the RAST server (Rapid Annotation using Subsystem Technology) ( https://rast.nmpdr.org/ ) (Table 1, Data file 3, 4, 5) [6, 7]. The Pseudomonas sp. strain HOU2 genome was determined to be most similar to that of Pseudomonas koreensis using the Type Strain Genome Server ( https://tygs.dsmz.de/ , version v391) [8].
Whole-genome sequence of Pseudomonas sp. strain HOU2 isolated from dangshen (Codonopsis javanica) roots. BMC genomic data 25(1):107. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11670358/pdf/
Pub Date: 2024-12-18 DOI: 10.1186/s12863-024-01289-9
Bin Mao, Yue Zheng, Yunli Xiao, Kaixia Yang, Jingru Shangguan, Mi Shen, Hao Sun, Xiangliang Fang, Yue Fu
Smittia aterrima (Meigen, 1818) and Smittia pratorum (Goetghebuer, 1927) are important indicator insects for aquatic environments, showing extensive tolerance to the environment. However, the genome-wide phylogenetic relationships and characteristics of the detoxification mechanisms in S. aterrima and S. pratorum remain unclear. Based on the genomes of the two species obtained in our preliminary studies and nine genomes from the NCBI database, phylogenetic analysis indicated that chironomids diverged from other mosquitoes approximately 200 million years ago (MYA) and that S. aterrima and S. pratorum diverged about 30 MYA. Gene family evolution analysis showed significant expansion of 43 and 15 gene families in S. aterrima and S. pratorum, respectively, particularly those related to detoxification pathways. Positive selection analysis revealed that genes under positive selection are crucial for promoting environmental adaptation. Additionally, the detoxification-associated gene families, including cytochrome P450 (CYP), glutathione S-transferase (GST), ATP-binding cassette (ABC), carboxylesterase (CCE), and UDP-glucuronosyltransferase (UGT), were annotated. Our analysis showed that these five detoxification gene families have significantly expanded in the chironomid genomes. This study highlights the genome evolution of chironomids and the mechanisms underlying their tolerance to environmental challenges.
Genome-wide phylogenetic analysis and expansion of gene families involved in detoxification in Smittia aterrima (Meigen) and Smittia pratorum (Goetghebuer) (Diptera, Chironomidae). BMC genomic data 25(1):106. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11657295/pdf/
Background: Gossypium raimondii serves as a widely used genomic model cotton species. Its genetic influence to enhance fiber quality and ability to adapt to challenging environments both contribute to increasing cotton production. The formins are a large protein family that predominately consists of FH1 and FH2 domains. The presence of the formin domains highly regulates the actin and microtubule filament in the cytoskeleton dynamics confronting various abiotic stresses such as drought, salinity, and cold temperatures.
Results: In this study, 26 formin genes were analyzed and characterized in G. raimondii and mostly were found in the nucleus and chloroplast. According to the evolutionary phylogenetic relationship, GrFH were dispersed and classified into seven different groups and shared an ancestry relationship with MtFH. The GrFH gene structure prediction revealed diverse intron-exon arrangements between groups. The FH2 conserved domain was found in all the GrFH distributed on 12 different chromosomes. Moreover, 11 pairs of GrFH transpired segmental duplication. Among them, GrFH4-GrFH7 evolved 35 million years ago (MYA) according to the evolutionary divergence time. Besides, 57 cis-acting regulatory elements (CAREs) motifs were found to play a potential role in plant growth, development, and in response to various abiotic stresses, including cold stress. The GrFH genes mostly exhibited biological processes resulting in the regulation of actin polymerization. The ERF, GATA, MYB, and LBD, major transcription factors (TFs) families in GrFH, regulated expression in abiotic stress specifically salt as well as defense against certain pathogens. The microRNA of GrFH unveiled the regulatory mechanism to regulate their gene expression in abiotic stresses such as salt and cold. One of the most economic aspects of cotton (G.raimondii) is the production of lint due to its use in manufacturing fabrics and other industrial applications. The expression profiles of GrFH in different tissues particularly during the conversion from ovule to fiber (lint), and the increased levels (up-regulation) of GrFH4, GrFH6, GrFH12, GrFH14, and GrFH26 under cold conditions, along with GrFH19 and GrFH26 in response to salt stress, indicated their potential involvement in combating these environmental challenges. Moreover, these stress-tolerant GrFH linked to cytoskeleton dynamics are essential in producing high-quality lint.
Conclusions: The findings of this study can help elucidate the evolutionary and functional characteristics of formin genes, decipher their potential role in abiotic stresses such as cold and salt, and inform future wet-lab studies.
Genome-wide identification, characterization and expression profiles of FORMIN gene family in cotton (Gossypium Raimondii L.). Pollob Shing, Md Shohel Ul Islam, Mst Sumaiya Khatun, Fatema Tuz Zohra, Naimul Hasan, Shaikh Mizanur Rahman, Md Abdur Rauf Sarkar. BMC Genomic Data 25(1):105, published 2024-12-18. DOI: 10.1186/s12863-024-01285-z. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11657977/pdf/
Pub Date : 2024-12-17 DOI: 10.1186/s12863-024-01290-2
Andreas Martin Lisewski
Objectives: The furin cleavage site of the SARS-CoV-2 spike (S) glycoprotein is a key determinant of SARS-CoV-2 virulence and COVID-19 pathogenicity. Located at the S1/S2 junction, it is unique among sarbecoviruses but frequently found among betacoronaviruses. Recent evidence suggests that this site includes two additional functional motifs: a pat7 nuclear localization signal and two flanking O-glycosites. However, a systematic genus and subgenus analysis of spike protein sequences bearing this polyfunctional sequence domain has been missing.
Data description: Here we report comprehensive sequence data to demonstrate that, among spike proteins of genus Betacoronavirus and outside of the SARS-CoV-2 clade, a fully analogous S1/S2 domain was found in only one other virus: the artificial MERS infectious clone MERS-MA30, already described in 2017, which was rationally selected from serial passage in genetically humanized mice. As the evolutionarily closest betacoronaviruses outside of the SARS-CoV-2 clade lack all three of its functional motifs, these data extend the current view on SARS-CoV-2 pre-pandemic origins beyond natural evolution and zoonosis by presenting the analogous S1/S2 MERS-MA30 sequence domain as a precise molecular blueprint for SARS-CoV-2.
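As a rough illustration of what screening spike sequences for a furin-like site can look like, the Python sketch below scans a protein string for the minimal furin recognition pattern R-X-X-R. The abstract does not state which pattern or software the author used, so the regular expression, the function name `find_furin_like_sites`, and the short example fragment are all assumptions made for illustration; the pat7 signal and the O-glycosites are not modeled here.

```python
import re

# Assumption: minimal furin recognition motif R-X-X-R (X = any residue).
# The lookahead makes the scan report overlapping candidates as well.
FURIN_LIKE = re.compile(r"(?=(R..R))")


def find_furin_like_sites(protein_seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched residues) for every R-X-X-R hit."""
    seq = protein_seq.upper()
    return [(m.start(1), m.group(1)) for m in FURIN_LIKE.finditer(seq)]


if __name__ == "__main__":
    # Illustrative fragment only (hypothetical input, not data from the study).
    fragment = "TNSPRRARSVAS"
    for start, motif in find_furin_like_sites(fragment):
        print(f"candidate site {motif} at position {start}")
```

A hit from such a scan is only a candidate: whether a site is actually cleaved, or sits at the S1/S2 junction, needs alignment and experimental context that a pattern match cannot provide.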
Pre-pandemic artificial MERS analog of polyfunctional SARS-CoV-2 S1/S2 furin cleavage site domain is unique among spike proteins of genus Betacoronavirus. Andreas Martin Lisewski. BMC Genomic Data 25(1):104, published 2024-12-17. DOI: 10.1186/s12863-024-01290-2. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11650820/pdf/
Pub Date : 2024-12-05 DOI: 10.1186/s12863-024-01286-y
Rashmi Mahajan, Anuj Kumar Tyagi
Background: Tuberculosis (TB) patients undergoing anti-tuberculosis treatment often face serious adverse drug reactions, such as hepatotoxicity. Genetic variants of the N-acetyltransferase 2 (NAT2) gene have been linked to an increased risk of these toxic events.
Objective: This study aims to provide a comprehensive evaluation of the evidence linking NAT2 genetic variants to anti-tuberculosis drug-related hepatotoxicity (ATDH).
Method: A comprehensive review and meta-analysis was performed by searching databases such as PubMed, Scopus, and Web of Science. A total of 24 articles were included in the dataset. Meta-analyses were conducted to pool estimates of the association between the slow acetylator (SA) genotype and ATDH. The studies were stratified by ethnicity, regimen, genotyping method, criteria for liver toxicity, and dosage. A further meta-analysis was conducted for the specific SA type most likely responsible for ATDH.
Results: Across the included studies, individuals with slow NAT2 acetylator status had a significantly greater risk of ATDH (odds ratio [OR] 2.52; 95% CI: 1.95-3.27; p < 0.001) than individuals with other acetylator types (i.e., rapid and intermediate). Among slow acetylators, the NAT2*5/7, NAT2*5/6, and NAT2*6/6 genotypes showed a stronger association than other variants.
Conclusion: Our meta-analysis confirms a significant association between slow NAT2 acetylator status and increased hepatotoxicity risk. The findings of the present study underscore the potential of pharmacogenomic testing to improve TB treatment outcomes. By identifying patients with the slow acetylator NAT2 genotype, healthcare providers can predict an increased risk of anti-tuberculosis drug-induced hepatotoxicity. This allows for personalized treatment strategies, such as adjusting drug dosages or selecting alternative therapies, to minimize adverse effects and optimize efficacy.
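To show how the reported pooled estimate hangs together, the Python sketch below back-calculates the standard error of the log odds ratio from the published 95% confidence interval and derives the corresponding z statistic and two-sided p-value. It assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state; the function name `log_or_summary` is hypothetical, and only the three input numbers (2.52, 1.95, 3.27) come from the abstract.

```python
import math


def log_or_summary(or_point: float, ci_low: float, ci_high: float,
                   z_crit: float = 1.959964) -> tuple[float, float, float, float]:
    """Recover log(OR), its standard error, z, and a two-sided p-value.

    Assumes a symmetric Wald interval on the log scale:
    log(OR) +/- z_crit * SE.
    """
    log_or = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return log_or, se, z, p_value


if __name__ == "__main__":
    log_or, se, z, p = log_or_summary(2.52, 1.95, 3.27)
    print(f"log(OR) = {log_or:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.2e}")
```

Under this assumption the standard error comes out at roughly 0.13 on the log scale and the p-value falls well below 0.001, in line with the reported result.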
Pharmacogenomic insights into tuberculosis treatment shows the NAT2 genetic variants linked to hepatotoxicity risk: a systematic review and meta-analysis. Rashmi Mahajan, Anuj Kumar Tyagi. BMC Genomic Data 25(1):103, published 2024-12-05. DOI: 10.1186/s12863-024-01286-y. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622454/pdf/