
BMC Genomics: Latest Articles

A stepwise guide for pangenome development in crop plants: an alfalfa (Medicago sativa) case study.
IF 3.5; Zone 2 (Biology); Q2 Biotechnology & Applied Microbiology; Pub Date: 2024-10-31; DOI: 10.1186/s12864-024-10931-w
Harpreet Kaur, Laura M Shannon, Deborah A Samac

Background: The concept of pangenomics and the importance of structural variants are gaining recognition within the plant genomics community. Owing to advances in sequencing and computational technology, it has become feasible to sequence the entire genomes of numerous individuals of a single species at a reasonable cost. Pangenomes have been constructed for many major diploid crops, including rice, maize, soybean, sorghum, pearl millet, peas, sunflower, grapes, and mustards. However, pangenomes for polyploid species are relatively scarce and are available for only a few crops, including wheat, cotton, rapeseed, and potatoes.

Main body: In this review, we explore the various methods used in crop pangenome development, discussing the challenges and implications of these techniques based on insights from published pangenome studies. We offer a systematic guide and discuss the tools available for constructing a pangenome and conducting downstream analyses. Alfalfa, a highly heterozygous, cross-pollinated, autotetraploid forage crop species, is used as an example to discuss the concerns and challenges posed by polyploid crop species. We conducted a comparative analysis of linear and graph-based methods by constructing an alfalfa graph pangenome from three publicly available genome assemblies. To illustrate the intricacies captured by pangenome graphs for a complex crop genome, we aligned five different gene sequences against the three graph-based pangenomes. The comparison of the three graph pangenome methods reveals notable differences in the genomic variation captured by each pipeline.

Conclusion: Pangenome resources are proving invaluable by offering insights into core and dispensable genes, novel gene discovery, and genome-wide patterns of variation. Developing user-friendly online portals for linear pangenome visualization has made these resources accessible to the broader scientific and breeding community. However, challenges remain with graph-based pangenomes including compatibility with other tools, extraction of sequence for regions of interest, and visualization of genetic variation captured in pangenome graphs. These issues necessitate further refinement of tools and pipelines to effectively address the complexities of polyploid, highly heterozygous, and cross-pollinated species.
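
The distinction between core and dispensable genes mentioned above can be illustrated with a minimal sketch. It assumes a simple presence/absence view in which a gene is "core" if it occurs in every accession and "dispensable" otherwise; the accession names and gene identifiers are hypothetical toy data, and the function is not taken from the reviewed article or from any pangenome tool.

```python
from typing import Dict, Set, Tuple


def classify_genes(presence: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split genes into core (present in every accession) and
    dispensable (absent from at least one accession)."""
    if not presence:
        return set(), set()
    gene_sets = list(presence.values())
    all_genes = set().union(*gene_sets)      # every gene seen in any accession
    core = set.intersection(*gene_sets)      # genes shared by all accessions
    return core, all_genes - core


# Hypothetical toy input: accession -> genes detected in that accession.
presence = {
    "accession_A": {"g1", "g2", "g3"},
    "accession_B": {"g1", "g2", "g4"},
    "accession_C": {"g1", "g3", "g4"},
}

core, dispensable = classify_genes(presence)
print(sorted(core))
print(sorted(dispensable))
```

Real pipelines usually relax the "every accession" rule (for example, treating genes present in most accessions as soft-core), so the strict threshold here is only one possible design choice.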

BMC Genomics 25(1): 1022. Publication date: 2024-10-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11526573/pdf/
Citations: 0
sRNAdeep: a novel tool for bacterial sRNA prediction based on DistilBERT encoding mode and deep learning algorithms.
IF 3.5; Zone 2 (Biology); Q2 Biotechnology & Applied Microbiology; Pub Date: 2024-10-31; DOI: 10.1186/s12864-024-10951-6
Weiye Qian, Jiawei Sun, Tianyi Liu, Zhiyuan Yang, Stephen Kwok-Wing Tsui

Background: Bacterial small regulatory RNA (sRNA) plays a crucial role in cell metabolism and could be used as a new potential drug target in the treatment of pathogen-induced disease. However, experimental methods for identifying sRNAs still require a large investment of human and material resources.

Methods: In this study, we propose a novel sRNA prediction model called sRNAdeep based on the DistilBERT feature extraction and TextCNN methods. The sRNA and non-sRNA sequences of bacteria were considered as sentences and then fed into a composite model consisting of deep learning models to evaluate classification performance.
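
Treating a sequence as a "sentence" generally means splitting it into word-like tokens before it reaches a language model. The sketch below shows one common way to do this, overlapping k-mers; the abstract does not state how sRNAdeep tokenizes its input, so the function name, the choice of k = 3, and the stride are all assumptions made for illustration.

```python
from typing import List


def kmer_tokens(seq: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split a nucleotide sequence into overlapping k-mers ("words").

    k and stride are illustrative defaults, not values from the paper.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


# A short hypothetical RNA fragment rendered as a space-separated "sentence".
tokens = kmer_tokens("acguac")
sentence = " ".join(tokens)
print(sentence)
```

A sentence built this way could then be passed to a text tokenizer and encoder; sequences shorter than k simply yield no tokens.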

Results: By filtering sRNAs from the BSRD database, we obtained a validation dataset comprising 2438 positive and 4730 negative samples. Benchmark experiments showed that sRNAdeep performed better on the various indexes than previous sRNA prediction tools. By applying our tool to the Mycobacterium tuberculosis (MTB) genome, we identified 21 sRNAs within the intergenic and intron regions. A set of 272 target genes regulated by these sRNAs was also captured in MTB. The proteins encoded by two genes (lysX and icd1) are implicated in drug response, with significant active sites related to drug resistance mechanisms of MTB.

Conclusion: Our newly developed sRNAdeep can help researchers identify bacterial sRNAs more precisely and is freely available from https://github.com/pyajagod/sRNAdeep.git .

BMC Genomics 25(1): 1021. Publication date: 2024-10-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11526673/pdf/
Citations: 0
Comparative analysis of the PAL gene family in nine citruses provides new insights into the stress resistance mechanism of Citrus species.
IF 3.5; Zone 2 (Biology); Q2 Biotechnology & Applied Microbiology; Pub Date: 2024-10-31; DOI: 10.1186/s12864-024-10938-3
Tuo Yin, Rong Xu, Ling Zhu, Xiuyao Yang, Mengjie Zhang, Xulin Li, Yinqiang Zi, Ke Wen, Ke Zhao, Hanbing Cai, Xiaozhen Liu, Hanyao Zhang

Background: The phenylalanine ammonia-lyase (PAL) gene, a well-studied plant defense gene, is crucial for growth, development, and stress resistance. The PAL gene family has been studied in many plants. Citrus is among the most vital cash crops worldwide. However, the PAL gene family has not been comprehensively studied in most Citrus species, and the biological functions and specific underlying mechanisms are unclear.

Results: We identified 41 PAL genes from nine Citrus species and revealed different patterns of evolution among the PAL genes in different Citrus species. Gene duplication was found to be a vital mechanism for the expansion of the PAL gene family in citrus. In addition, there was a strong correlation between the ability of PAL genes to respond to stress and their evolutionary duration in citrus. PAL genes with shorter evolutionary times were involved in a greater number of stress responses, and these PAL genes with broad-spectrum resistance were all single-copy genes. By further integrating the lignin and flavonoid synthesis pathways in citrus, we observed that PAL genes contribute to the synthesis of lignin and flavonoids, which enhance the physical defense and ROS scavenging ability of citrus plants, thereby helping them withstand stress.

Conclusions: This study provides a comprehensive framework of the PAL gene family in citrus, and we propose a hypothetical model for the stress resistance mechanism in citrus. This study provides a foundation for further investigations into the biological functions of PAL genes in the growth, development, and response to various stresses in citrus.

BMC Genomics 25(1): 1020. Publication date: 2024-10-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11526608/pdf/
Citations: 0
Comparative plastome analysis of Arundinelleae (Poaceae, Panicoideae), with implications for phylogenetic relationships and plastome evolution.
IF 3.5; Zone 2 (Biology); Q2 Biotechnology & Applied Microbiology; Pub Date: 2024-10-30; DOI: 10.1186/s12864-024-10871-5
Li-Qiong Jiang, Bryan T Drew, Watchara Arthan, Guo-Ying Yu, Hong Wu, Yue Zhao, Hua Peng, Chun-Lei Xiang

Background: Arundinelleae is a small tribe within the Poaceae (grass family) possessing a widespread distribution that includes Asia, the Americas, and Africa. Several species of Arundinelleae are used as natural forage, feed, and raw materials for paper. The tribe is taxonomically cumbersome due to a paucity of clear diagnostic morphological characters. There has been scant genetic and genomic research conducted for this group, and as a result the phylogenetic relationships and species boundaries within Arundinelleae are poorly understood.

Results: We compared and analyzed 11 plastomes of Arundinelleae, of which seven plastomes were newly sequenced. The plastomes range from 139,629 base pairs (bp) (Garnotia tenella) to 140,943 bp (Arundinella barbinodis), with a standard four-part structure. The average GC content was 38.39%, but varied in different regions of the plastome. In all, 110 genes were annotated, comprising 76 protein-coding genes, 30 tRNA genes, and four rRNA genes. Furthermore, 539 simple sequence repeats, 519 long repeats, and 10 hyper-variable regions were identified from the 11 plastomes of Arundinelleae. A phylogenetic reconstruction of Panicoideae based on 98 plastomes demonstrated the monophyly of Arundinella and Garnotia, but the circumscription of Arundinelleae remains unresolved.
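
Two of the statistics reported above, GC content and simple sequence repeats (SSRs), are straightforward to compute. The following is a minimal sketch rather than the authors' pipeline: the abstract does not name the software or thresholds used, so the function names, the regular-expression approach, and the minimum repeat count of 10 are assumptions chosen for illustration.

```python
import re
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_ssrs(seq: str, unit_len: int = 1, min_repeats: int = 10) -> List[Tuple[int, str, int]]:
    """Find tandem repeats of a unit of length unit_len.

    Returns (start, unit, repeat_count) tuples. The thresholds are
    illustrative assumptions, not values from the abstract.
    """
    # A captured unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // unit_len)
        for m in pattern.finditer(seq.upper())
    ]


# Hypothetical toy sequence: a 12-base poly-A run flanked by "CG".
toy = "CG" + "A" * 12 + "CG"
print(gc_content("GGCCAATT"))
print(find_ssrs(toy))
```

With `unit_len` greater than 1, this simple pattern also reports homopolymer runs as repeats of a uniform unit (for example "AA"), so dedicated SSR tools filter such cases; that refinement is omitted here for brevity.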

Conclusion: Complete chloroplast genome sequences can improve phylogenetic resolution relative to single-marker approaches, particularly within taxonomically challenging groups. All phylogenetic analyses strongly support the monophyly of both Arundinella and Garnotia, but the monophyly of Arundinelleae was not well supported. The intergeneric phylogenetic relationships within Arundinelleae require clarification, indicating that more data are necessary to resolve generic boundaries and evaluate the monophyly of Arundinelleae. A comprehensive taxonomic revision of the tribe is necessary. In addition, the identified hyper-variable regions could function as molecular markers for clarifying phylogenetic relationships and potentially as barcoding markers for species identification in the future.

BMC Genomics 25(1): 1016. Publication date: 2024-10-30. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11523875/pdf/
Citations: 0
Genome-wide analysis of phytochrome-interacting factor (PIF) families and their potential roles in light and gibberellin signaling in Chinese pine.
IF 3.5; Zone 2 (Biology); Q2 Biotechnology & Applied Microbiology; Pub Date: 2024-10-30; DOI: 10.1186/s12864-024-10915-w
Yingtian Guo, Chengyan Deng, Guizhi Feng, Dan Liu

Phytochrome-interacting factors (PIFs) are a subgroup of transcription factors within the basic helix-loop-helix (bHLH) family, playing a crucial role in integrating various environmental signals to regulate plant growth and development. Despite the significance of PIFs in these processes, a comprehensive genome-wide analysis of PIFs in conifers has yet to be conducted. In this investigation, three PtPIF genes were identified in Chinese pine and categorized into three subgroups, with conserved motifs indicating the presence of the APA/APB motif and bHLH domain in the PtPIF1 and PtPIF3 proteins. Phylogenetic analysis revealed that the PtPIF1 and PtPIF3 proteins belong to the PIF7/8 and PIF3 groups, respectively, and were relatively conserved among gymnosperms. Additionally, a class of PIF lacking the APA/APB motif was identified in conifers, suggesting that its function may differ from that of traditional PIFs. The cis-elements of the PtPIF genes were systematically examined, and analysis of PtPIF gene expression across various tissues and under different light, temperature, and plant hormone conditions demonstrated similar expression profiles for PtPIF1 and PtPIF3. Investigations into protein-protein interactions and co-expression networks suggested the involvement of PtPIFs and PtPHYA/Bs in circadian rhythms and hormone signal transduction. Further analysis of transcriptome data and experimental validation indicated an interaction between PtPIF3 and PtPHYB1, potentially linked to diurnal rhythms. Notably, the study revealed that PtPIF3 may be involved in gibberellic acid (GA) signaling through its interaction with PtDELLAs, suggesting a potential role for PtPIF3 in mediating both light and GA responses. Overall, this research provides a foundation for future studies investigating the functions of PIFs in conifer growth and development.

BMC Genomics 25(1): 1017. Publication date: 2024-10-30. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11523891/pdf/
Citations: 0
Reclassification of the first Bacillus tropicus phage calls for reclassification of other Bacillus temperate phages previously designated as plasmids.
IF 3.5; Zone 2 (Biology); Q2 Biotechnology & Applied Microbiology; Pub Date: 2024-10-30; DOI: 10.1186/s12864-024-10937-4
Ridwaan Nazeer Milase, Johnson Lin, Nontobeko E Mvubu, Nokulunga Hlengwa

Bacillus tropicus is a recently identified subspecies of the Bacillus cereus group of bacteria that has been shown to possess genes associated with antimicrobial resistance (AMR) and has been identified as the causative agent of anthrax-like disease in Chinese soft-shelled turtles. In addition, B. tropicus has demonstrated great potential in the fields of bioremediation and bioconversion. This article describes the comparative genomics of a Bacillus phage vB_Btc-RBClinn15 (referred to as RBClin15) infecting the recently identified B. tropicus AOA-CPS1. RBClin15 is a temperate phage with a putative parABS partitioning system as well as an arbitrium system, which are presumed to enable extrachromosomal genome maintenance and regulate the lysis/lysogeny switch, respectively. The temperate phage RBClin15 has been sequenced; however, it was erroneously deposited as a plasmid in the NCBI GenBank database. A BLASTn search against the GenBank database using the whole genome sequence of RBClin15 revealed seven other putative temperate phages that were also deposited as plasmids in the database. Comparative genomic analyses show that RBClin15 shares between 87 and 92% average nucleotide identity (ANI) with the seven temperate phages from the GenBank database. Altogether, RBClin15 and the seven putative temperate phages share common genome arrangements and < 29% protein homologs with the closest phages, including 0105phi7-2. Phylogenomic and proteome-based phylogenetic tree analyses showed that RBClin15 and the seven temperate phages formed a separate branch from the closest phage, 0105phi7-2. In addition, the intergenomic similarity between RBClin15 and its closely related phages ranged between 0.3 and 47.7%. Collectively, based on the phylogenetic and comparative genomic analyses, we propose three new species, which will include RBClin15 and the seven temperate phages, in the newly proposed genus Theosmithvirus under Caudoviricetes.

BMC Genomics 25(1): 1018. Publication date: 2024-10-30. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11526630/pdf/
Citations: 0
Bank vole genomics links determinate and indeterminate growth of teeth.
IF 3.5 Zone 2 Biology Q2 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-10-30 DOI: 10.1186/s12864-024-10901-2
Zachary T Calamari, Andrew Song, Emily Cohen, Muspika Akter, Rishi Das Roy, Outi Hallikas, Mona M Christensen, Pengyang Li, Pauline Marangoni, Jukka Jernvall, Ophir D Klein

Background: Continuously growing teeth are an important innovation in mammalian evolution, yet genetic regulation of continuous growth by stem cells remains incompletely understood. Dental stem cells responsible for tooth crown growth are lost at the onset of tooth root formation. Genetic signaling that initiates this loss is difficult to study with the ever-growing incisor and rooted molars of mice, the most common mammalian dental model species, because signals for root formation overlap with signals that pattern tooth size and shape (i.e., cusp patterns). Bank and prairie voles (Cricetidae, Rodentia, Glires) have evolved rooted and unrooted molars while retaining similar size and shape, providing alternative models for studying roots.

Results: We assembled a de novo genome of Myodes glareolus, a vole with high-crowned, rooted molars, and performed genomic and transcriptomic analyses in a broad phylogenetic context of Glires (rodents and lagomorphs) to assess differential selection and evolution in tooth-forming genes. Bulk transcriptomic comparisons of embryonic molar development between bank voles and mice demonstrated overall conservation of gene expression levels, with species-specific differences corresponding to the accelerated and more extensive patterning of the vole molar. We leveraged the convergent evolution of unrooted molars across the clade to examine the changes that may underlie it. We identified 15 dental genes with changing synteny relationships and six dental genes undergoing positive selection across Glires, two of which, Dspp and Aqp1, were undergoing positive selection in species with unrooted molars. Decreased expression of both genes in prairie voles with unrooted molars compared to bank voles supports the presence of positive selection and may underlie differences in root formation.

Conclusions: Our results support ongoing evolution of dental genes across Glires and identify candidate genes for mechanistic studies of root formation. Comparative research using the bank vole as a model species can reveal the complex evolutionary background of convergent evolution for ever-growing molars.

Citations: 0
GASIDN: identification of sub-Golgi proteins with multi-scale feature fusion.
IF 3.5 Zone 2 Biology Q2 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-10-30 DOI: 10.1186/s12864-024-10954-3
Jianan Sui, Jiazi Chen, Yuehui Chen, Naoki Iwamori, Jin Sun

The Golgi apparatus is a crucial component of the inner membrane system in eukaryotic cells, playing a central role in protein biosynthesis. Dysfunction of the Golgi apparatus has been linked to neurodegenerative diseases. Accurate identification of sub-Golgi protein types is therefore essential for developing effective treatments for such diseases. Because experimental methods for identifying sub-Golgi protein types are expensive and time-consuming, various computational methods have been developed as identification tools. However, the majority of these methods rely solely on neighboring features in the protein sequence and neglect the crucial spatial structure information of the protein. To discover alternative methods for accurately identifying sub-Golgi proteins, we have developed a model called GASIDN. The GASIDN model extracts multi-dimensional features by applying a 1D convolution module to protein sequences and a graph learning module to contact maps constructed from AlphaFold2. The model uses the deep representation learning model SeqVec to initialize protein sequences. GASIDN achieved accuracy values of 98.4% and 96.4% in independent testing and ten-fold cross-validation, respectively, outperforming the majority of previous predictors. To the best of our knowledge, this is the first method that uses multi-scale feature fusion to identify and locate sub-Golgi proteins. To assess the generalizability and scalability of our model, we applied it to the identification of proteins from other organelles, including plant vacuoles and peroxisomes. The results of these experiments were promising, indicating the effectiveness and versatility of our model. The source code and datasets can be accessed at https://github.com/SJNNNN/GASIDN.
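As a rough illustration of the kind of structural input the abstract mentions, the sketch below builds a binary residue contact map from C-alpha coordinates and runs one round of neighbor averaging over it. It is not the GASIDN implementation (see the linked repository for that): the function names, the 8 Å distance cutoff, and the mean-aggregation step are all assumptions made here for illustration only.

```python
import numpy as np

def contact_map(ca_coords, threshold=8.0):
    """Binary residue-residue contact map from C-alpha coordinates (N x 3).

    The 8.0 angstrom cutoff is an assumed, commonly used value, not one
    taken from the GASIDN paper.
    """
    coords = np.asarray(ca_coords, dtype=float)
    # Pairwise coordinate differences via broadcasting: shape (N, N, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return (dist < threshold).astype(np.int8)

def graph_mean_aggregate(features, adjacency):
    """Average each residue's features over its contacts (itself included).

    A hypothetical stand-in for a graph learning layer; every residue is in
    contact with itself (distance 0), so the degree is never zero.
    """
    adj = np.asarray(adjacency, dtype=float)
    degree = adj.sum(axis=1, keepdims=True)
    return adj @ np.asarray(features, dtype=float) / degree
```

In a setting like the one described, the coordinates would come from a predicted structure and the per-residue features from a sequence embedding; here both are plain arrays so the sketch stays self-contained.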

Citations: 0
Unveiling putative modulators of mutable collagenous tissue in the brittle star Ophiomastix wendtii: an RNA-Seq analysis.
IF 3.5 Zone 2 Biology Q2 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-10-29 DOI: 10.1186/s12864-024-10926-7
Reyhaneh Nouri, Vladimir Mashanov, April Harris, Gari New, William Taylor, Daniel Janies, Robert W Reid, Denis Jacob Machado

Collagenous connective tissue, found throughout the bodies of metazoans, plays a crucial role in maintaining structural integrity. This versatile tissue has the potential for numerous biomedical applications, including the development of innovative collagen-based biomaterials. Inspiration for such advancements can be drawn from echinoderms, a group of marine invertebrates that includes sea stars, sea cucumbers, brittle stars, sea urchins, and sea lilies. Through their nervous system, these organisms can reversibly control the pliability of their connective tissue components (i.e., tendons and ligaments) that are composed of mutable collagenous tissue (MCT). The variable tensile properties of the MCT allow echinoderms to perform unique functions, including postural maintenance, reduction of muscular energy use, autotomy to avoid predators, and asexual reproduction through fission. The changes in the tensile strength of MCT structures are specifically controlled by specialized neurosecretory cells called juxtaligamental cells. These cells release substances that either soften or stiffen the MCT. So far, only a few of these substances have been purified and characterized, and the genetic underpinning of MCT biology remains unknown. Therefore, we have conducted this research to identify MCT-related genes in echinoderms as a first step towards a better understanding of the MCT molecular control mechanisms. Our ultimate goal is to unlock new biomaterial applications based on this knowledge. In this project, we used RNA-Seq to identify and annotate differentially expressed genes in the MCT structures of the brittle star Ophiomastix wendtii. As a result, we present a list of 16 putative MCT modulator genes, which will be validated and characterized in forthcoming functional analyses.

Citations: 0
Detecting complex infections in trypanosomatids using whole genome sequencing.
IF 3.5 Zone 2 Biology Q2 BIOTECHNOLOGY & APPLIED MICROBIOLOGY Pub Date: 2024-10-29 DOI: 10.1186/s12864-024-10862-6
João Luís Reis-Cunha, Daniel Charlton Jeffares

Background: Trypanosomatid parasites are a group of protozoans that cause devastating diseases that disproportionately affect developing countries. These protozoans have developed several mechanisms of adaptation to survive in the mammalian host, such as extensive expansion of multigene families involved in host-parasite interaction, adaptations to invade and modulate host cells, and the presence of aneuploidy and polyploidy. Two mechanisms might result in "complex" isolates, with more than two haplotypes present in a single sample: multiplicity of infections (MOI) and polyploidy. We have developed and validated a methodology to identify multiclonal infections and polyploidy using whole genome sequencing reads, based on fluctuations in allelic read depth at heterozygous positions. It can be easily implemented in experiments ranging from the sequencing of a single sample to larger population surveys.

Results: The methodology estimates the complexity index (CI) of an isolate and compares real samples with simulated clonal infections at the individual and population levels, excluding regions with somy and gene copy number variation. It was primarily validated with simulated MOI and known polyploid isolates, from Leishmania and Trypanosoma cruzi respectively. The approach was then used to assess the complexity of infection using genome-wide SNP data from 497 trypanosomatid samples from four clades (L. donovani/L. infantum, L. braziliensis, T. cruzi and T. brucei), providing an overview of multiclonal infection and polyploidy in these cultured parasites. We show that our method robustly detects complex infections in samples with at least 25x coverage, 100 heterozygous SNPs, and 5-10% of the reads corresponding to the secondary clone. We find that relatively small proportions (≤ 7%) of cultured trypanosomatid isolates are complex.

Conclusions: The method can accurately identify polyploid isolates, and can identify multiclonal infections in scenarios with sufficient genome read coverage. We package our method in a single R script that requires only a standard variant call format (VCF) file to run (https://github.com/jaumlrc/Complex-Infections). Our analyses indicate that multiclonality and polyploidy do occur in all clades, but not very frequently in cultured trypanosomatids. We caution that our estimates are lower bounds due to the limitations of current laboratory and bioinformatic methods.
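To make the idea of "fluctuations in allelic read depth at heterozygous positions" concrete, the following sketch reads VCF records and summarizes how far ALT read fractions at heterozygous sites drift from the 0.5 expected in a clonal diploid sample. It is not the authors' R script and does not compute their complexity index, whose formula the abstract does not give. The function names and the mean-deviation summary are hypothetical, the defaults of 25 reads and 100 sites are borrowed from the thresholds quoted above but applied per site as an assumption, and regions with somy or copy number variation are not excluded here. It also assumes a single-sample VCF whose FORMAT column carries `GT` and `AD`.

```python
from statistics import mean

def het_allele_fractions(vcf_lines, min_depth=25):
    """Return ALT read fractions at heterozygous biallelic sites (first sample).

    min_depth is a per-site read threshold (an assumption of this sketch).
    """
    fractions = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 or "," in fields[4]:
            continue  # no sample column, or a multi-allelic site
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        genotype = sample.get("GT", "").replace("|", "/")
        if genotype not in ("0/1", "1/0"):
            continue  # keep heterozygous calls only
        try:
            ref_reads, alt_reads = (int(x) for x in sample.get("AD", "").split(","))
        except ValueError:
            continue  # AD missing or malformed
        depth = ref_reads + alt_reads
        if depth >= max(min_depth, 1):
            fractions.append(alt_reads / depth)
    return fractions

def allele_balance_deviation(fractions, min_sites=100):
    """Mean |ALT fraction - 0.5|, or None when too few sites are available."""
    if len(fractions) < min_sites:
        return None
    return mean(abs(f - 0.5) for f in fractions)
```

A value near zero is what a clonal diploid sample would be expected to give; larger values only hint at complexity and would need comparison against simulated clonal data, as the abstract describes.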

Citations: 0