Since its initial release in 2001, the human reference genome has undergone continuous improvement in quality, and the recently released telomere-to-telomere (T2T) version, T2T-CHM13, reaches the highest level of continuity and accuracy after 20 years of effort by working on a simplified, nearly homozygous genome of a hydatidiform mole cell line. Here, to provide an authentic complete diploid human genome reference for the Han Chinese, the largest population in the world, we assembled the genome of a male Han Chinese individual, T2T-YAO, which includes T2T assemblies of all the 22 + X + M and 22 + Y chromosomes in both haplotypes. The quality of T2T-YAO is much better than that of all currently available diploid assemblies, and its haploid version, T2T-YAO-hp, generated by selecting the better assembly for each autosome, reaches the top quality of fewer than one error per 29.5 Mb, even higher than that of T2T-CHM13. Derived from an individual living in the aboriginal region of the Han population, T2T-YAO shows clear ancestry and potential genetic continuity from the ancient ancestors. Each haplotype of T2T-YAO possesses ~330 Mb of exclusive sequence, ~3100 unique genes, and tens of thousands of nucleotide and structural variations as compared with CHM13, highlighting the necessity of a population-stratified reference genome. The construction of T2T-YAO, an accurate and authentic representative of the Chinese population, would enable precise delineation of genomic variations and advance our understanding of the heritability of diseases and phenotypes, especially within the context of the unique variations of the Chinese population.
T2T-YAO: A Telomere-to-telomere Assembled Diploid Reference Genome for Han Chinese. Yukun He, Yanan Chu, Shuming Guo, Jiang Hu, Ran Li, Yali Zheng, Xinqian Ma, Zhenglin Du, Lili Zhao, Wenyi Yu, Jianbo Xue, Wenjie Bian, Feifei Yang, Xi Chen, Pingan Zhang, Rihan Wu, Yifan Ma, Changjun Shao, Jing Chen, Jian Wang, Jiwei Li, Jing Wu, Xiaoyi Hu, Qiuyue Long, Mingzheng Jiang, Hongli Ye, Shixu Song, Guangyao Li, Yue Wei, Yu Xu, Yanliang Ma, Yanwen Chen, Keqiang Wang, Jing Bao, Wen Xi, Fang Wang, Wentao Ni, Moqin Zhang, Yan Yu, Shengnan Li, Yu Kang, Zhancheng Gao. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 1085-1100. Pub Date: 2023-12-01. DOI: 10.1016/j.gpb.2023.08.001. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082261/pdf/
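As a rough worked example of the accuracy figure quoted in the T2T-YAO abstract, an error rate of one error per 29.5 Mb can be converted to a Phred-scaled quality value (QV), a convention commonly used when reporting assembly accuracy. The QV framing and the helper name `phred_qv` are assumptions added here for illustration; the abstract itself states only the error rate.

```python
import math


def phred_qv(errors: float, bases: float) -> float:
    """Phred-scaled quality value: QV = -10 * log10(errors / bases)."""
    return -10.0 * math.log10(errors / bases)


# One error per 29.5 Mb, the bound quoted for T2T-YAO-hp.
qv = phred_qv(errors=1, bases=29.5e6)
print(f"QV is about {qv:.1f}")
```

Because the abstract reports fewer than one error per 29.5 Mb, the value computed this way is a lower bound on the corresponding QV.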
In perinatal medicine, intrauterine growth restriction (IUGR) is one of the greatest challenges. The etiology of IUGR is multifactorial, but most cases are thought to arise from placental insufficiency. However, identifying the placental cause of IUGR can be difficult due to numerous confounding factors. Selective IUGR (sIUGR) is a good model for investigating how impaired placentation affects fetal development, as the growth discordance between monochorionic twins cannot be explained by confounding genetic or maternal factors. Herein, we constructed and analyzed the placental proteomic profiles of IUGR twins and normal cotwins. Specifically, we identified a total of 5481 proteins, of which 233 were differentially expressed (57 up-regulated and 176 down-regulated) in IUGR twins. Bioinformatics analysis indicates that these differentially expressed proteins (DEPs) are mainly associated with cardiovascular system development and function, organismal survival, and organismal development. Notably, 34 DEPs are significantly enriched in angiogenesis, and diminished placental angiogenesis in IUGR twins was further confirmed. Moreover, we found decreased expression of metadherin (MTDH) in the placentas of IUGR twins and demonstrated that MTDH contributes to placental angiogenesis and fetal growth in vitro. Collectively, our findings reveal the comprehensive proteomic signatures of placentas for sIUGR twins, and the DEPs identified may provide in-depth insights into the pathogenesis of placental dysfunction and subsequent impaired fetal growth.
The Proteome Landscape of Human Placentas for Monochorionic Twins with Selective Intrauterine Growth Restriction. Xin-Lu Meng, Peng-Bo Yuan, Xue-Ju Wang, Jing Hang, Xiao-Ming Shi, Yang-Yu Zhao, Yuan Wei. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 1246-1259. Pub Date: 2023-12-01. DOI: 10.1016/j.gpb.2023.03.002. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082409/pdf/
The fetal liver (FL) is the key erythropoietic organ during fetal development, but knowledge on human FL erythropoiesis is very limited. In this study, we sorted primary erythroblasts from FL cells and performed RNA sequencing (RNA-seq) analyses. We found that temporal gene expression patterns reflected changes in function during primary human FL terminal erythropoiesis. Notably, the expression of genes enriched in proteolysis and autophagy was up-regulated in orthochromatic erythroblasts (OrthoEs), suggesting the involvement of these pathways in enucleation. We also performed RNA-seq of in vitro cultured erythroblasts derived from FL CD34+ cells. Comparison of transcriptomes between the primary and cultured erythroblasts revealed significant differences, indicating impacts of the culture system on gene expression. Notably, the expression of lipid metabolism-related genes was increased in cultured erythroblasts. We further immortalized erythroid cell lines from FL and cord blood (CB) CD34+ cells (FL-iEry and CB-iEry, respectively). FL-iEry and CB-iEry were immortalized at the proerythroblast stage and can be induced to differentiate into OrthoEs, but their enucleation ability was very low. Comparison of the transcriptomes between OrthoEs with and without enucleation capability revealed the down-regulation of pathways involved in chromatin organization and mitophagy in OrthoEs without enucleation capacity, indicating that defects in chromatin organization and mitophagy contribute to the inability of OrthoEs to enucleate. Additionally, the expression of HBE1, HBZ, and HBG2 was up-regulated in FL-iEry compared with CB-iEry, and such up-regulation was accompanied by down-regulated expression of BCL11A and up-regulated expression of LIN28B and IGF2BP1. Our study provides new insights into human FL erythropoiesis and rich resources for future studies.
Comprehensive Characterization and Global Transcriptome Analysis of Human Fetal Liver Terminal Erythropoiesis. Yongshuai Han, Shihui Wang, Yaomei Wang, Yumin Huang, Chengjie Gao, Xinhua Guo, Lixiang Chen, Huizhi Zhao, Xiuli An. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 1117-1132. Pub Date: 2023-12-01. DOI: 10.1016/j.gpb.2023.07.001. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082260/pdf/
Pub Date: 2023-12-01; Epub Date: 2022-12-20; DOI: 10.1016/j.gpb.2022.12.003
Zhi-Xue Yang, Ya-Wen Fu, Juan-Juan Zhao, Feng Zhang, Si-Ang Li, Mei Zhao, Wei Wen, Lei Zhang, Tao Cheng, Jian-Ping Zhang, Xiao-Bing Zhang
A series of clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR associated protein 9 (Cas9) systems have been engineered for genome editing. The most widely used Cas9 proteins are SpCas9 from Streptococcus pyogenes and SaCas9 from Staphylococcus aureus. However, a comparison of their detailed gene editing outcomes is still lacking. By characterizing the editing outcomes of 11 sites in human induced pluripotent stem cells (iPSCs) and K562 cells, we found that SaCas9 could edit the genome with greater efficiencies than SpCas9. We also compared the effects of spacer lengths of single-guide RNAs (sgRNAs; 18-21 nt for SpCas9 and 19-23 nt for SaCas9) and found that the optimal spacer lengths were 20 nt and 21 nt for SpCas9 and SaCas9, respectively. However, the optimal spacer length for a particular sgRNA was 18-21 nt for SpCas9 and 21-22 nt for SaCas9. Furthermore, SpCas9 exhibited a more substantial bias than SaCas9 for nonhomologous end-joining (NHEJ) +1 insertion at the fourth nucleotide upstream of the protospacer adjacent motif (PAM), indicating a characteristic of a staggered cut. Accordingly, editing with SaCas9 led to higher efficiencies of NHEJ-mediated double-stranded oligodeoxynucleotide (dsODN) insertion or homology-directed repair (HDR)-mediated adeno-associated virus serotype 6 (AAV6) donor knock-in. Finally, GUIDE-seq analysis revealed that SaCas9 exhibited significantly reduced off-target effects compared with SpCas9. Our work indicates the superior performance of SaCas9 over SpCas9 in transgene integration-based therapeutic gene editing and the necessity of identifying the optimal spacer length to achieve desired editing results.
Superior Fidelity and Distinct Editing Outcomes of SaCas9 Compared with SpCas9 in Genome Editing. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 1206-1220. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082263/pdf/
Pub Date: 2023-12-01; Epub Date: 2023-04-20; DOI: 10.1016/j.gpb.2023.04.002
Ann-Yae Na, Hyojin Lee, Eun Ki Min, Sanjita Paudel, So Young Choi, HyunChae Sim, Kwang-Hyeon Liu, Ki-Tae Kim, Jong-Sup Bae, Sangkyu Lee
Recently developed technologies that allow the analysis of each individual omics layer have provided unbiased insight into ongoing disease processes. However, it remains challenging to specify the study design for the subsequent integration strategies that can associate sepsis pathophysiology and clinical outcomes. Here, we conducted a time-dependent multi-omics integration (TDMI) in a sepsis-associated liver dysfunction (SALD) model. We successfully deduced the relation of the Toll-like receptor 4 (TLR4) pathway with SALD. Although TLR4 is a critical factor in sepsis progression, it was identified only in the TDMI analysis and not in the single-omics analyses. This finding indicates that the TDMI-based approach is more advantageous than single-omics analyses in terms of exploring the underlying pathophysiological mechanism of SALD. Furthermore, the TDMI-based approach can be an ideal paradigm for insightful biological interpretations of multi-omics datasets that will potentially reveal novel insights into basic biology, health, and diseases, thus allowing the identification of promising candidates for therapeutic strategies.
Novel Time-dependent Multi-omics Integration in Sepsis-associated Liver Dysfunction. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 1101-1116. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082264/pdf/
Pub Date: 2023-10-01; Epub Date: 2023-04-17; DOI: 10.1016/j.gpb.2023.03.001
Lin-Fang Ju, Heng-Ji Xu, Yun-Gui Yang, Ying Yang
During mammalian preimplantation development, a totipotent zygote undergoes several cell cleavages and two rounds of cell fate determination, ultimately forming a mature blastocyst. Along with compaction, the establishment of apicobasal cell polarity breaks the symmetry of an embryo and guides subsequent cell fate choice. Although the lineage segregation of the inner cell mass (ICM) and trophectoderm (TE) is the first sign of cell differentiation, several molecules have been shown to bias the early cell fate through their inter-cellular variations at much earlier stages, including the 2- and 4-cell stages. The underlying mechanisms of early cell fate determination have long been an important research topic. In this review, we summarize the molecular events that occur during early embryogenesis, as well as the current understanding of their regulatory roles in cell fate decisions. Moreover, as powerful tools for early embryogenesis research, single-cell omics techniques have been applied to both mouse and human preimplantation embryos and have contributed to the discovery of cell fate regulators. Here, we summarize their applications in the research of preimplantation embryos, and provide new insights and perspectives on cell fate regulation.
Omics Views of Mechanisms for Cell Fate Determination in Early Mammalian Development. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 950-961. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928378/pdf/
Pub Date: 2023-10-01; Epub Date: 2023-06-22; DOI: 10.1016/j.gpb.2023.06.001
Yaojun Wang, Shiwei Sun
Revolutionizing Antibody Discovery: An Innovative AI Model for Generating Robust Libraries. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 910-912. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928364/pdf/
Pub Date: 2023-10-01; Epub Date: 2023-02-14; DOI: 10.1016/j.gpb.2023.02.004
Wenbin Li, Lin Gao, Xin Yi, Shuangfeng Shi, Jie Huang, Leming Shi, Xiaoyan Zhou, Lingying Wu, Jianming Ying
Defects in genes involved in the DNA damage response cause homologous recombination repair deficiency (HRD). HRD is found in a subgroup of cancer patients for several tumor types, and it has a clinical relevance to cancer prevention and therapies. Accumulating evidence has identified HRD as a biomarker for assessing the therapeutic response of tumor cells to poly(ADP-ribose) polymerase inhibitors and platinum-based chemotherapies. Nevertheless, the biology of HRD is complex, and its applications and the benefits of different HRD biomarker assays are controversial. This is primarily due to inconsistencies in HRD assessments and definitions (gene-level tests, genomic scars, mutational signatures, or a combination of these methods) and difficulties in assessing the contribution of each genomic event. Therefore, we aim to review the biological rationale and clinical evidence of HRD as a biomarker. This review provides a blueprint for the standardization and harmonization of HRD assessments.
Patient Assessment and Therapy Planning Based on Homologous Recombination Repair Deficiency. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 962-975. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928375/pdf/
Antibody leads must fulfill multiple desirable properties to be clinical candidates. Primarily due to the low throughput of the experimental procedure, the need for such multi-property optimization creates a bottleneck in preclinical antibody discovery and development, because addressing one issue usually causes another. We developed a reinforcement learning (RL) method, named AB-Gen, for antibody library design using a generative pre-trained transformer (GPT) as the policy network of the RL agent. We showed that this model can learn the antibody space of heavy chain complementarity determining region 3 (CDRH3) and generate sequences with similar property distributions. In addition, when using human epidermal growth factor receptor-2 (HER2) as the target, the agent model of AB-Gen was able to generate novel CDRH3 sequences that fulfill multi-property constraints. In total, 509 generated sequences passed all property filters, and three highly conserved residues were identified. The importance of these residues was further demonstrated by molecular dynamics simulations, confirming that the agent model was capable of grasping important information in this complex optimization task. Overall, the AB-Gen method is able to design novel antibody sequences with a higher success rate than the traditional propose-then-filter approach. It has the potential to be used in practical antibody design, thus empowering the antibody discovery and development process. The source code of AB-Gen is freely available at Zenodo (https://doi.org/10.5281/zenodo.7657016) and BioCode (https://ngdc.cncb.ac.cn/biocode/tools/BT007341).
AB-Gen: Antibody Library Design with Generative Pre-trained Transformer and Deep Reinforcement Learning. Xiaopeng Xu, Tiantian Xu, Juexiao Zhou, Xingyu Liao, Ruochi Zhang, Yu Wang, Lu Zhang, Xin Gao. Genomics, Proteomics & Bioinformatics (impact factor 11.5), pp. 1043-1053. Pub Date: 2023-10-01. DOI: 10.1016/j.gpb.2023.03.004. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928431/pdf/
Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields, including biochemistry, medicine, physics, mathematics, and computer science. These researchers adopt various research paradigms to attack the same structure prediction problem: biochemists and physicists attempt to reveal the principles governing protein folding; mathematicians, especially statisticians, usually start by assuming a probability distribution of protein structures given a target sequence and then find the most likely structure; and computer scientists formulate protein structure prediction as an optimization problem, namely finding the structural conformation with the lowest energy or minimizing the difference between the predicted and native structures. These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman, namely, data modeling and algorithmic modeling. Recently, we have also witnessed the great success of deep learning in protein structure prediction. In this review, we survey the efforts made in protein structure prediction. We compare the research paradigms adopted by researchers from different fields, with an emphasis on the shift of research paradigms in the era of deep learning. In short, algorithmic modeling techniques, especially deep neural networks, have considerably improved the accuracy of protein structure prediction; however, theories that interpret the neural networks and deeper knowledge of protein folding are still much needed.
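The statistical and optimization formulations described above can be written side by side. The notation here is an assumption for illustration and is not taken from the review: x is the target sequence, S a candidate structure, E an energy function, f_θ a parametric predictor, and d a structural distance.

```latex
% Statistical (data-modeling) view: most probable structure for sequence x
\hat{S} = \arg\max_{S} P(S \mid x)

% Optimization view: lowest-energy conformation
\hat{S} = \arg\min_{S} E(S; x)

% Alternative optimization target: fit predictor parameters \theta so that
% the predicted structure is close to the native one under a distance d
\hat{\theta} = \arg\min_{\theta} d\bigl(f_{\theta}(x),\, S^{\mathrm{native}}\bigr)

% If P is a Boltzmann distribution over conformations at temperature T > 0,
% the first two formulations select the same structure:
P(S \mid x) \propto \exp\!\bigl(-E(S; x) / k_{B} T\bigr)
\;\Longrightarrow\;
\arg\max_{S} P(S \mid x) = \arg\min_{S} E(S; x)
```

Under the Boltzmann assumption the two views differ mainly in what is modeled explicitly (a distribution or an energy), not in which structure they select.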
{"title":"Protein Structure Prediction: Challenges, Advances, and the Shift of Research Paradigms.","authors":"Bin Huang, Lupeng Kong, Chao Wang, Fusong Ju, Qi Zhang, Jianwei Zhu, Tiansu Gong, Haicang Zhang, Chungong Yu, Wei-Mou Zheng, Dongbo Bu","doi":"10.1016/j.gpb.2022.11.014","DOIUrl":"10.1016/j.gpb.2022.11.014","url":null,"abstract":"<p><p>Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields, including biochemistry, medicine, physics, mathematics, and computer science. These researchers adopt various research paradigms to attack the same structure prediction problem: biochemists and physicists attempt to reveal the principles governing protein folding; mathematicians, especially statisticians, usually start from assuming a probability distribution of protein structures given a target sequence and then find the most likely structure, while computer scientists formulate protein structure prediction as an optimization problem - finding the structural conformation with the lowest energy or minimizing the difference between predicted structure and native structure. These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman, namely, data modeling and algorithmic modeling. Recently, we have also witnessed the great success of deep learning in protein structure prediction. In this review, we present a survey of the efforts for protein structure prediction. We compare the research paradigms adopted by researchers from different fields, with an emphasis on the shift of research paradigms in the era of deep learning. 
In short, the algorithmic modeling techniques, especially deep neural networks, have considerably improved the accuracy of protein structure prediction; however, theories interpreting the neural networks and knowledge on protein folding are still highly desired.</p>","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":" ","pages":"913-925"},"PeriodicalIF":11.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928435/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9593946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}