Genomics, Proteomics & Bioinformatics: Latest Articles (Page 2)

Global Marine Cold Seep Metagenomes Reveal Diversity of Taxonomy, Metabolic Function, and Natural Products

IF 9.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2024-01-10 DOI: 10.1093/gpbjnl/qzad006

Tao Yu, Yingfeng Luo, Xinyu Tan, Dahe Zhao, Xiaochun Bi, Chenji Li, Yanning Zheng, Hua Xiang, Songnian Hu

Abstract Cold seeps in the deep sea are closely linked to energy exploration as well as global climate change. The alkane-dominated chemical energy-driven model makes cold seeps an oasis of deep-sea life, showcasing an unparalleled reservoir of microbial genetic diversity. By analyzing 113 metagenomes collected from 14 global sites across 5 cold seep types, we present a comprehensive Cold Seep Microbiomic Database (CSMD) to archive the genomic and functional diversity of cold seep microbiome. The CSMD included over 49 million non-redundant genes and 3175 metagenome-assembled genomes (MAGs), which represented 1895 species spanning 105 phyla. In addition, beta diversity analysis indicated that both the sampling site and cold seep type had a substantial impact on the prokaryotic microbiome community composition. Heterotrophic and anaerobic metabolisms were prevalent in microbial communities, accompanied by considerable mixotrophs and facultative anaerobes, highlighting the versatile metabolic potential in cold seeps. Furthermore, secondary metabolic gene cluster analysis indicated that at least 98.81% of the sequences potentially encoded novel natural products, with ribosomal processing peptides being the predominant type widely distributed in archaea and bacteria. Overall, the CSMD represents a valuable resource that would enhance the understanding and utilization of global cold seep microbiomes.


Citations: 0

Molecular Evolution of Protein Sequences and Codon Usage in Monkeypox Viruses

IF 9.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2024-01-10 DOI: 10.1093/gpbjnl/qzad003

Ke-jia Shan, Changcheng Wu, Xiaolu Tang, Roujian Lu, Yaling Hu, Wenjie Tan, Jian Lu

Abstract The monkeypox virus (mpox virus, MPXV) epidemic in 2022 has posed a significant public health risk. Yet, the evolutionary principles of MPXV remain largely unknown. Here, we examined the evolutionary patterns of protein sequences and codon usage in MPXV. We first demonstrated the signal of positive selection in OPG027, specifically in the Clade I lineage of MPXV. Subsequently, we discovered accelerated protein sequence evolution over time in the variants responsible for the 2022 outbreak. Furthermore, we showed strong epistasis between amino acid substitutions located in different genes. The codon adaptation index (CAI) analysis revealed that MPXV genes tended to use more non-preferred codons compared to human genes, and the CAI decreased over time and diverged between clades, with Clade I > IIa and IIb-A > IIb-B. While the decrease in fatality rate among the three groups aligned with the CAI pattern, it remains unclear whether this correlation was coincidental or if the deoptimization of codon usage in MPXV led to a reduction in fatality rates. This study sheds new light on the mechanisms that govern the evolution of MPXV in human populations.
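
The abstract above does not say how the CAI was computed. As an illustration only, the sketch below uses the commonly cited definition: each codon gets a relative adaptiveness weight (its count in a reference gene set divided by the count of the most frequent synonymous codon), and the CAI of a gene is the geometric mean of those weights. The codon families, the reference counts, and the function names are hypothetical toy values, not taken from the paper, and codons without a weight are simply skipped.

```python
import math

# Hypothetical synonymous-codon families (a tiny subset of the genetic code).
SYNONYMOUS = {
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "F": ["TTT", "TTC"],
}


def relative_adaptiveness(reference_counts):
    """Weight of a codon = its count / highest count among its synonyms."""
    weights = {}
    for codons in SYNONYMOUS.values():
        top = max(reference_counts.get(c, 0) for c in codons)
        if top == 0:
            continue  # family never seen in the reference set
        for c in codons:
            weights[c] = reference_counts.get(c, 0) / top
    return weights


def cai(sequence, weights):
    """Geometric mean of the weights of the scorable codons in `sequence`."""
    usable = len(sequence) - len(sequence) % 3
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    if not logs:
        raise ValueError("sequence contains no scorable codons")
    return math.exp(sum(logs) / len(logs))


# Toy reference codon counts (assumed, for illustration only).
reference_counts = {"AAA": 20, "AAG": 80, "GAA": 30, "GAG": 60, "TTT": 50, "TTC": 50}
weights = relative_adaptiveness(reference_counts)

preferred = cai("AAGGAGTTC", weights)      # only the most frequent codons
non_preferred = cai("AAAGAATTT", weights)  # two rarer codons lower the score
```

With these toy counts, a gene built only from the most frequent codons scores 1.0, and swapping in rarer synonymous codons pulls the score down, which is the sense in which a falling CAI indicates "deoptimized" codon usage relative to the reference set.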


Citations: 0

Whole-genome Sequencing Reveals Autooctoploidy in Chinese Sturgeon and Its Evolutionary Trajectories

IF 9.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2024-01-10 DOI: 10.1093/gpbjnl/qzad002

Binzhong Wang, Bin Wu, Xueqing Liu, Yacheng Hu, Yao Ming, Mingzhou Bai, Juanjuan Liu, Kan Xiao, Qingkai Zeng, Jing Yang, Hongqi Wang, Baifu Guo, Chun Tan, Zixuan Hu, Xun Zhao, Yanhong Li, Zhen Yue, Junpu Mei, Wei Jiang, Yuanjin Yang, Zhiyuan Li, Yong Gao, Lei Chen, Jianbo Jian, Hejun Du

Abstract The order Acipenseriformes, which includes sturgeons and paddlefishes, represents “living fossils” with complex genomes that are good models for understanding whole-genome duplication (WGD) and ploidy evolution in fishes. Here, we sequenced and assembled the first high-quality chromosome-level genome for the complex octoploid Acipenser sinensis (Chinese sturgeon), a critically endangered species that also represents a poorly understood ploidy group in Acipenseriformes. Our results show that A. sinensis is a complex autooctoploid species containing four kinds of octovalents (8n), a hexavalent (6n), two tetravalents (4n), and a divalent (2n). An analysis taking into account delayed rediploidization reveals that the octoploid genome composition of Chinese sturgeon results from two rounds of homologous WGDs, and further provides insights into the timing of its ploidy evolution. This study provides the first octoploid genome resource of Acipenseriformes for understanding ploidy compositions and evolutionary trajectories of polyploid fishes.


Citations: 0

TransCell: In silico Characterization of Genomic Landscape and Cellular Responses by Deep Transfer Learning

IF 9.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2024-01-10 DOI: 10.1093/gpbjnl/qzad008

Shan-Ju Yeh, Shreya Paithankar, Ruoqiao Chen, Jing Xing, Mengying Sun, Ke Liu, Jiayu Zhou, Bin Chen

Abstract Gene expression profiling of new or modified cell lines has become routine; however, obtaining comprehensive molecular characterization and cellular responses for a variety of cell lines, including those derived from underrepresented groups, is not trivial when resources are minimal. Using gene expression to predict other measurements has been actively explored; however, its predictive power across various measurements has not been systematically investigated. We evaluated commonly used machine learning methods and presented TransCell, a two-step deep transfer learning framework that utilized the knowledge derived from pan-cancer tumor samples to predict molecular features and responses. Among these models, TransCell has the best performance in predicting metabolite, gene effect score (or genetic dependency), and drug sensitivity, and has comparable performance in predicting mutation, copy number variation, and protein expression. Notably, TransCell improved the performance by over 50% in drug sensitivity prediction and achieved a correlation of 0.7 in gene effect score prediction. Furthermore, predicted drug sensitivities revealed potential repurposing candidates for 100 new pediatric cancer cell lines, and predicted gene effect scores reflected BRAF resistance in melanoma cell lines. Together, we investigated the predictive power of gene expression in six molecular measurement types and developed a web portal (http://apps.octad.org/transcell/) that enables the prediction of 352,000 genomic and cellular response features solely from gene expression profiles.


Citations: 0

Microbiome in Female Reproductive Health: Implications for Fertility and Assisted Reproductive Technologies

IF 9.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2024-01-10 DOI: 10.1093/gpbjnl/qzad005

Liwen Xiao, Zhenqiang Zuo, Fangqing Zhao

Abstract The microbiome plays a critical role in the process of conception and the outcomes of pregnancy. Disruptions in microbiome homeostasis in women of reproductive age can lead to various pregnancy complications, which significantly impact maternal and fetal health. Recent studies have associated the microbiome in the female reproductive tract (FRT) with assisted reproductive technology (ART) outcomes, and restoring microbiome balance has been shown to improve fertility in infertile couples. This review provides an overview of the role of the microbiome in female reproductive health, including its implications for pregnancy outcomes and ARTs. Additionally, recent advances in the use of microbial biomarkers as indicators of pregnancy disorders are summarized. A comprehensive understanding of the characteristics of the microbiome before and during pregnancy and its impact on reproductive health will greatly promote maternal and fetal health. Such knowledge can also contribute to the development of ARTs and microbiome-based interventions.


Citations: 0

Q-BioLiP: A Comprehensive Resource for Quaternary Structure-based Protein–ligand Interactions

IF 9.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2024-01-04 DOI: 10.1093/gpbjnl/qzae001

Hong Wei, Wenkai Wang, Zhenling Peng, Jianyi Yang

Abstract Since its establishment in 2013, BioLiP has become one of the widely used resources for protein–ligand interactions. Nevertheless, several issues with it have become apparent over the past decade. For example, the protein–ligand interactions are represented in the form of single-chain-based tertiary structures, which may be inappropriate as many interactions involve multiple protein chains (known as quaternary structures). We sought to address these issues, resulting in Q-BioLiP, a comprehensive resource for quaternary structure-based protein–ligand interactions. The major features of Q-BioLiP include: (1) representing protein structures in the form of quaternary structures rather than single-chain-based tertiary structures; (2) pairing DNA/RNA chains properly rather than separating them; (3) providing both experimental and predicted binding affinities; (4) retaining both biologically relevant and irrelevant interactions to alleviate the problem of misjudging ligands’ biological relevance; and (5) developing a new quaternary structure-based algorithm for the modelling of protein–ligand complex structures. With these new features, Q-BioLiP is expected to be a valuable resource for studying biomolecule interactions, including protein–small molecule interaction, protein–metal ion interaction, protein–peptide interaction, protein–protein interaction, protein–DNA/RNA interaction, and RNA–small molecule interaction. Q-BioLiP is freely available at https://yanglab.qd.sdu.edu.cn/Q-BioLiP/.


Citations: 0

NextPolish2: A Repeat-aware Polishing Tool for Genomes Assembled Using HiFi Long Reads

IF 9.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2024-01-04 DOI: 10.1093/gpbjnl/qzad009

Jiang Hu, Zhuo Wang, Fan Liang, Shan-Lin Liu, Kai Ye, De-Peng Wang

Abstract The high-fidelity (HiFi) long-read sequencing technology developed by PacBio has greatly improved the base-level accuracy of genome assemblies. However, these assemblies still contain base-level errors, particularly within the error-prone regions of HiFi long reads. Existing genome polishing tools usually introduce overcorrections and haplotype switch errors when correcting errors in genomes assembled from HiFi long reads. Here we describe an upgraded genome polishing tool, NextPolish2, which can fix base errors remaining in those “highly accurate” genomes assembled from HiFi long reads without introducing excessive overcorrections and haplotype switch errors. We believe that NextPolish2 is of great significance for further improving the accuracy of telomere-to-telomere (T2T) genomes. NextPolish2 is freely available at https://github.com/Nextomics/NextPolish2.


Citations: 0

Systematic Exploration of Optimized Base Editing gRNA Design and Pleiotropic Effects with BExplorer

IF 11.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2023-12-01 DOI: 10.1016/j.gpb.2022.06.005

Gongchen Zhang, Chenyu Zhu, Xiaohan Chen, Jifang Yan, Dongyu Xue, Zixuan Wei, Guohui Chuai, Qi Liu

Base editing technology is being increasingly applied in genome engineering, but the current strategy for designing guide RNAs (gRNAs) relies substantially on empirical experience rather than a dependable and efficient in silico design. Furthermore, the pleiotropic effect of base editing on disease treatment remains unexplored, which prevents its further clinical usage. Here, we presented BExplorer, an integrated and comprehensive computational pipeline to optimize the design of gRNAs for 26 existing types of base editors in silico. Using BExplorer, we described its results for two types of mainstream base editors, BE3 and ABE7.10, and evaluated the pleiotropic effects of the corresponding base editing loci. BExplorer revealed 524 and 900 editable pathogenic single nucleotide polymorphism (SNP) loci in the human genome together with the selected optimized gRNAs for BE3 and ABE7.10, respectively. In addition, the impact of 707 edited pathogenic SNP loci following base editing on 131 diseases was systematically explored by revealing their pleiotropic effects, indicating that base editing should be carefully utilized given the potential pleiotropic effects. Collectively, the systematic exploration of optimized base editing gRNA design and the corresponding pleiotropic effects with BExplorer provides a computational basis for applying base editing in disease treatment.


Volume 21, Issue 6, Pages 1237-1245. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082405/pdf/

Citations: 0

T2T-YAO, T2T-SHUN, and more

IF 11.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2023-12-01 DOI: 10.1016/j.gpb.2023.09.002

Jingfa Xiao, Jun Yu

Citations: 0

T2T-YAO: A Telomere-to-telomere Assembled Diploid Reference Genome for Han Chinese

IF 11.5, Zone 2 (Biology), Q1 GENETICS & HEREDITY

Genomics, Proteomics & Bioinformatics

Pub Date : 2023-12-01 DOI: 10.1016/j.gpb.2023.08.001

Yukun He, Yanan Chu, Shuming Guo, Jiang Hu, Ran Li, Yali Zheng, Xinqian Ma, Zhenglin Du, Lili Zhao, Wenyi Yu, Jianbo Xue, Wenjie Bian, Feifei Yang, Xi Chen, Pingan Zhang, Rihan Wu, Yifan Ma, Changjun Shao, Jing Chen, Jian Wang, Zhancheng Gao

Since its initial release in 2001, the human reference genome has undergone continuous improvement in quality, and the recently released telomere-to-telomere (T2T) version, T2T-CHM13, reached the highest level of continuity and accuracy after 20 years of effort by working on a simplified, nearly homozygous genome of a hydatidiform mole cell line. Here, to provide an authentic complete diploid human genome reference for the Han Chinese, the largest population in the world, we assembled the genome of a male Han Chinese individual, T2T-YAO, which includes T2T assemblies of all the 22 + X + M and 22 + Y chromosomes in both haploids. The quality of T2T-YAO is much better than those of all currently available diploid assemblies, and its haploid version, T2T-YAO-hp, generated by selecting the better assembly for each autosome, reaches the top quality of fewer than one error per 29.5 Mb, even higher than that of T2T-CHM13. Derived from an individual living in the aboriginal region of the Han population, T2T-YAO shows clear ancestry and potential genetic continuity with ancient ancestors. Each haplotype of T2T-YAO possesses ~330 Mb of exclusive sequences, ~3100 unique genes, and tens of thousands of nucleotide and structural variations as compared with CHM13, highlighting the necessity of a population-stratified reference genome. The construction of T2T-YAO, an accurate and authentic representative of the Chinese population, would enable precise delineation of genomic variations and advance our understanding of the heritability of diseases and phenotypes, especially within the context of the unique variations of the Chinese population.
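
To put the figure of "fewer than one error per 29.5 Mb" on a more familiar scale, the sketch below converts an error rate into a Phred-style consensus quality value, QV = -10 log10(errors / bases). Expressing the result this way is an assumption made here for illustration; the abstract itself reports only the error rate, and the function name is hypothetical.

```python
import math


def phred_qv(errors, bases):
    """Phred-scaled quality: QV = -10 * log10(error rate)."""
    if errors <= 0 or bases <= 0:
        raise ValueError("errors and bases must be positive")
    return -10.0 * math.log10(errors / bases)


# One error per 29.5 Mb, taken as the upper bound on the error rate.
qv = phred_qv(1, 29.5e6)
print(f"QV is at least {qv:.1f}")  # about 74.7
```

Because the abstract states fewer than one error per 29.5 Mb, the value computed here is a lower bound on the quality, not an exact figure.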

T2T-YAO: A Telomere-to-telomere Assembled Diploid Reference Genome for Han Chinese. Genomics, Proteomics & Bioinformatics, volume 21, issue 6, pages 1085-1100. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082261/pdf/

Citation count: 0